Stream Upload and Asynchronous Job Processing System
Lê Bá Minh – minhlb@vng.com.vn
Technical Manager – Zalo Team - VNG
Agenda
• 1/ Why do we need an Asynchronous Job Processing System?
• 2/ How does it work?
• 3/ Applications
• 4/ Q&A
Parallel Stream Upload
• Data is split into chunks, which are uploaded in parallel
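A minimal sketch of chunked parallel upload, assuming a fixed chunk size and an index attached to each chunk so the receiver can reassemble them. The names `split_into_chunks`, `upload_chunk`, and `parallel_upload` are hypothetical, and `upload_chunk` is a stand-in that echoes the chunk instead of sending it over the network.

```python
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 4  # bytes; hypothetical value, kept tiny for illustration


def split_into_chunks(data: bytes, chunk_size: int = CHUNK_SIZE):
    """Return (index, chunk) pairs that cover the data in order."""
    return [
        (offset // chunk_size, data[offset:offset + chunk_size])
        for offset in range(0, len(data), chunk_size)
    ]


def upload_chunk(indexed_chunk):
    """Stand-in for a network upload: it simply echoes the chunk back."""
    index, chunk = indexed_chunk
    return index, chunk


def parallel_upload(data: bytes, max_workers: int = 4) -> bytes:
    """Upload all chunks concurrently, then reassemble them by index."""
    chunks = split_into_chunks(data)
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        received = list(pool.map(upload_chunk, chunks))
    # On a real network chunks can arrive in any order, so sort by index.
    received.sort(key=lambda pair: pair[0])
    return b"".join(chunk for _, chunk in received)
```

The index travels with each chunk because the receiving side cannot rely on arrival order once uploads run in parallel.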
Facts
• Zalo Stream Upload
  • Continuous background voice upload
  • Background image upload
  • …
• Facts (now)
  • 1M voices/day
  • 800K images/day
  • Peak: 500 chunks/second
• Expected:
  • Scalable (more than 5,000 chunks/second)
  • High performance
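As a rough back-of-the-envelope check on these figures, the sketch below computes the average upload rate and the headroom the target implies. It counts whole uploads, not chunks, because the slides do not say how many chunks one upload produces; the variable names are illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60

voices_per_day = 1_000_000
images_per_day = 800_000
peak_chunks_per_second = 500
target_chunks_per_second = 5_000

# Average number of uploads (not chunks) arriving per second over a day.
average_uploads_per_second = (voices_per_day + images_per_day) / SECONDS_PER_DAY

# How far the scalability target sits above today's peak.
headroom_factor = target_chunks_per_second / peak_chunks_per_second
```

The average works out to roughly 20.8 uploads per second, and the target is ten times the current peak chunk rate.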
What we need
• Asynchronous Job processing System
[Diagram: two request flows. First flow: Collect Data, Processing Data, Response. Second flow: Collect Data, Response, with Processing Data handled by Workers.]
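The asynchronous flow can be sketched with an in-process queue, assuming the request handler only collects the data and responds, while background workers do the processing. This uses Python's standard `queue` and `threading` modules as a stand-in for a networked job server; `handle_request`, `worker`, and `run_demo` are hypothetical names.

```python
import queue
import threading

job_queue = queue.Queue()
results = {}


def handle_request(job_id, payload):
    """Collect the data, enqueue it, and respond without waiting."""
    job_queue.put((job_id, payload))
    return "accepted"


def worker():
    """Take jobs off the queue and process them in the background."""
    while True:
        item = job_queue.get()
        if item is None:  # sentinel: no more jobs for this worker
            job_queue.task_done()
            break
        job_id, payload = item
        results[job_id] = payload.upper()  # stand-in for real processing
        job_queue.task_done()


def run_demo(num_workers=3):
    """Start workers, submit five jobs, then shut the workers down."""
    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for thread in threads:
        thread.start()
    responses = [handle_request(i, f"chunk-{i}") for i in range(5)]
    for _ in threads:
        job_queue.put(None)  # one sentinel per worker
    job_queue.join()
    for thread in threads:
        thread.join()
    return responses
```

The caller gets its response as soon as the job is queued, so response time no longer depends on how long processing takes.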
What we need
• Asynchronous Job processing System
• Batch jobs
• Big-data jobs
• Highly reliable: no job is lost
• Distributed job-processing workers
• High performance
• Persistent
• Load balancing, failover, recoverable
Open-source solutions
• Shared-memory workers
  • All workers on one physical server
  • No failover
  • Not scalable
• Gearman
  • Good, but does not completely fit our requirements
  • No batch job support
  • Not fully reliable (jobs can be lost)
  • Load balancing is incomplete
  • Unstable above 2,000 jobs/second
Zalo Asynchronous Job Processing System
[Architecture diagram. Labels: Client (x2); Job Server, with Worker Manager, Job Manager, Job Caching, Persistent Manager, and Job Clean-Up; Worker 1, Worker 2, Worker 3; Z Database. Connections are TCP, labeled Short Connection and Long Connection.]
Implementation
• C/C++ for Job Server
• C/C++, Java for client and workers
• Binary Protocol
• Z-Database
Job State
[State diagram. States: Queuing, Processing, Failed, Time Out, Finished. Transition labels: Started, Deliver to Worker, Worker ACK Finished, Worker ACK Failed, No ACK.]
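One way to read the job states is as a small state machine. The pairing of transitions to states below is an assumption inferred from the labels (the slide text does not spell out the arrows): a started job is queued, delivery moves it to processing, and the worker's ACK, or its absence, decides the final state. The event names are hypothetical.

```python
from enum import Enum


class State(Enum):
    QUEUING = "Queuing"
    PROCESSING = "Processing"
    FAILED = "Failed"
    TIME_OUT = "Time Out"
    FINISHED = "Finished"


# (current state, event) -> next state.
# Assumed mapping; the source only lists the labels.
TRANSITIONS = {
    (State.QUEUING, "deliver_to_worker"): State.PROCESSING,
    (State.PROCESSING, "worker_ack_finished"): State.FINISHED,
    (State.PROCESSING, "worker_ack_failed"): State.FAILED,
    (State.PROCESSING, "no_ack"): State.TIME_OUT,
}


class Job:
    def __init__(self, job_id):
        self.job_id = job_id
        self.state = State.QUEUING  # assumed: "Started" places the job in the queue

    def apply(self, event):
        """Move to the next state, rejecting events that are not allowed."""
        key = (self.state, event)
        if key not in TRANSITIONS:
            raise ValueError(
                f"event {event!r} not allowed in state {self.state.value}"
            )
        self.state = TRANSITIONS[key]
        return self.state
```

Keeping the transitions in one table makes illegal moves, such as finishing a job that was never delivered, fail loudly.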
Job Type
• Single Job
  • Simple task
  • Delivered immediately
• Batch Job
  • Multiple tasks
  • Delivered once all tasks have been received
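The difference between the two job types can be sketched as follows, assuming each batch task carries its batch id, its index, and the total number of tasks expected. The classes `BatchJob` and `JobServer` and their methods are hypothetical and keep everything in memory.

```python
class BatchJob:
    def __init__(self, batch_id, expected_tasks):
        self.batch_id = batch_id
        self.expected_tasks = expected_tasks
        self.tasks = {}

    def add_task(self, index, payload):
        # Re-sending the same index overwrites it and is not counted twice.
        self.tasks[index] = payload

    def is_complete(self):
        return len(self.tasks) == self.expected_tasks


class JobServer:
    def __init__(self):
        self.delivered = []  # jobs handed to workers, in delivery order
        self.pending = {}    # batch_id -> BatchJob still waiting for tasks

    def submit_single(self, payload):
        """A single job is delivered immediately."""
        self.delivered.append([payload])

    def submit_batch_task(self, batch_id, expected_tasks, index, payload):
        """A batch is held back until every one of its tasks has arrived."""
        batch = self.pending.setdefault(
            batch_id, BatchJob(batch_id, expected_tasks)
        )
        batch.add_task(index, payload)
        if batch.is_complete():
            del self.pending[batch_id]
            ordered = [batch.tasks[i] for i in sorted(batch.tasks)]
            self.delivered.append(ordered)
```

For chunked uploads this means a worker sees the whole set of chunks at once, even when they arrive out of order.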
Deployment
[Deployment diagram. Labels: Business Server; Job Server 1 and Job Server 2 (Synchronized); Worker 1, Worker 2, Worker 3.]
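The slides show two synchronized job servers but do not say how a client switches between them. One possible approach, sketched here purely as an assumption, is for the business server to try each job server in order and fall back to the next when one is unreachable. `FakeJobServer`, `JobServerDown`, and `submit_with_failover` are hypothetical names.

```python
class JobServerDown(Exception):
    """Raised when a job server cannot accept a job."""


class FakeJobServer:
    """In-memory stand-in for a job server reachable over TCP."""

    def __init__(self, name, up=True):
        self.name = name
        self.up = up
        self.jobs = []

    def submit(self, job):
        if not self.up:
            raise JobServerDown(self.name)
        self.jobs.append(job)
        return self.name


def submit_with_failover(servers, job):
    """Try each server in order and return the name of the one that accepted."""
    for server in servers:
        try:
            return server.submit(job)
        except JobServerDown:
            continue  # fall through to the next server
    raise JobServerDown("no job server reachable")
```

Because the servers are synchronized, a job accepted by either one should be visible to the workers regardless of which server they are connected to.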
Applications
• Used for all asynchronous job processing in Zalo: voice upload, image upload, feed processing…
• Benchmark (single server)
  • 50K images/second (640x480)
  • 50K voices/second (30 s)
• Advantages
  • Batch jobs
  • Jobs are never lost
  • Workers can restart or stop at any time
  • Failover, load balancing, quick recovery from failure
• Issue
  • Job duplication (handled by the worker)
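Never losing a job usually implies redelivery after a missing ACK, so the same job can reach a worker twice. A minimal sketch of how a worker might absorb duplicates, assuming every job carries a unique id; `IdempotentWorker` is a hypothetical name, and a real worker would keep the set of seen ids in persistent storage.

```python
class IdempotentWorker:
    def __init__(self):
        self.seen = set()     # ids of jobs already handled
        self.processed = []   # payloads that were actually processed

    def handle(self, job_id, payload):
        """Process a job once; return False when it is a duplicate."""
        if job_id in self.seen:
            return False
        self.seen.add(job_id)
        self.processed.append(payload)
        return True
```

Usage: when the same job id is delivered twice, the second call returns `False` and the payload is not processed again.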
Q&A
Stream upload and asynchronous job processing  in large scale systems
