Adopting actors
An epic tail of loss and learning
Iain Hull
iain.hull@workday.com
@IainHull
http://workday.github.io
Workday
Growth, 2013 to 2016 (chart)
Cloud Master
Launch tasks Assign to agents
Service Growth, in millions of tasks per month (chart, axis from 0 to 20)
Print
Large
Small
Batch
Why Akka?
Initial Observations
Parent
Config Child
Snapshots
Changes
Message flow:
Ensure messages follow a consistent path
Creation:
Assume actor is recovering from failure
(state machine)
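
The "Creation" observation can be sketched as a small state machine. This is a minimal illustration in plain Python, not Akka code and not the talk's implementation; the class name ConfigChild, the message shapes, and the rule that changes stashed during recovery are newer than the snapshot are all assumptions made for the example. The actor always starts in a recovering phase, so a first start is simply recovery from whatever snapshot arrives.

```python
from enum import Enum, auto


class Phase(Enum):
    RECOVERING = auto()
    RUNNING = auto()


class ConfigChild:
    """Hypothetical child actor: every start is treated as a recovery."""

    def __init__(self):
        self.phase = Phase.RECOVERING
        self.config = {}
        self.stashed = []  # changes that arrived before the snapshot

    def receive(self, message):
        kind, payload = message
        if self.phase is Phase.RECOVERING:
            if kind == "snapshot":
                self.config = dict(payload)
                self.phase = Phase.RUNNING
                # Replay changes that arrived while recovering
                # (assumed here to be newer than the snapshot).
                pending, self.stashed = self.stashed, []
                for change in pending:
                    self.receive(change)
            else:
                self.stashed.append(message)
        elif kind == "change":
            key, value = payload
            self.config[key] = value
        elif kind == "snapshot":
            # A later snapshot replaces the state wholesale.
            self.config = dict(payload)


child = ConfigChild()
child.receive(("change", ("pool_size", 5)))  # arrives early, gets stashed
child.receive(("snapshot", {"pool_size": 3, "region": "dub"}))
print(child.phase.name, child.config)
```

Because there is no separate "fresh start" code path, the recovery path is exercised on every start rather than only after a failure.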
Anti-patterns
God Class
Movie Star
Pool
Agent
State
Agent Agent Agent Agent
Queue
Movie Star
Too much state
• Hard to reason about
• Too many messages in flight
• Hard to recover
• Bad concurrency
Split Brain
Pool
Agent
State
Agent Agent Agent Agent
Duplicate state
Single source of truth
• Synchronizing state is hard
• Failure causes
–State out of sync
–Causes more failure
Split Brain
Pool
Agent
State
Agent Agent Agent Agent
Task
Passing responsibility
Seems simple at first
• Do not always know who is in control
• Both actors updating the same row
• Creates race conditions
Can you let it crash?
Pool
Agent
State
Agent Agent Agent Agent
Lessons
Test for resilience
• Chaos Marmoset
• Unit test recovery
• Destructive system test
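
One way to unit test recovery is to inject failures while messages are being delivered and check that the final state is still correct. The sketch below is plain Python under stated assumptions, not the Chaos Marmoset itself: the Counter actor, the list standing in for a database, and the run_with_chaos helper are hypothetical, and failures are injected only before a message is handled, so retries never apply a message twice.

```python
import random


class ChaosError(Exception):
    """Injected failure; not a real bug in the actor."""


class Counter:
    """Hypothetical actor whose state can be rebuilt from a durable log."""

    def __init__(self, log):
        self.log = log          # stands in for a database
        self.total = sum(log)   # recovery: rebuild state from the log

    def receive(self, amount):
        self.log.append(amount)  # persist first, then update memory
        self.total += amount


def run_with_chaos(messages, failure_rate, seed):
    """Deliver messages, crashing the actor at random and restarting it."""
    rng = random.Random(seed)
    log = []
    actor = Counter(log)
    restarts = 0
    for message in messages:
        try:
            if rng.random() < failure_rate:
                raise ChaosError("injected before handling")
            actor.receive(message)
        except ChaosError:
            restarts += 1
            actor = Counter(log)    # a supervisor would restart the actor
            actor.receive(message)  # and the sender retries the message
    return actor.total, restarts


total, restarts = run_with_chaos(range(1, 101), failure_rate=0.3, seed=7)
print(total, restarts)
```

The check that matters is that the total is the same with and without injected crashes. A failure injected after the write to the log would also require idempotent messages, which this sketch does not cover.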
Stateless
Enterprise idioms do not apply
Sovereignty
One actor
• One row
• One shard
• One table
Otherwise failure hard to handle
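
A minimal sketch of sovereignty, assuming a hypothetical RowOwner actor and a dictionary standing in for a database table (plain Python threads, not Akka): only the owner writes the row, and everyone else sends it a message and waits for the reply.

```python
import queue
import threading

table = {"agent-1": "idle"}  # stands in for a database table


class RowOwner(threading.Thread):
    """Hypothetical actor that is the only writer for one row."""

    def __init__(self, key):
        super().__init__(daemon=True)
        self.key = key
        self.mailbox = queue.Queue()

    def run(self):
        while True:
            message = self.mailbox.get()
            if message is None:  # stop signal
                return
            task_id, reply = message
            # Check and update happen in one place, one message at a time.
            if table[self.key] == "idle":
                table[self.key] = f"busy:{task_id}"
                reply.put(True)
            else:
                reply.put(False)


owner = RowOwner("agent-1")
owner.start()
replies = queue.Queue()

# Five senders race to assign a task to the same agent.
senders = [
    threading.Thread(target=owner.mailbox.put, args=((f"task-{i}", replies),))
    for i in range(5)
]
for sender in senders:
    sender.start()
for sender in senders:
    sender.join()

results = [replies.get(timeout=5) for _ in range(5)]
owner.mailbox.put(None)
owner.join(timeout=5)
print(results.count(True), table["agent-1"])
```

Exactly one sender wins, whichever message the owner happens to dequeue first, and recovery after a crash only has to consider one writer.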
Atomicity
Actors
Atomic receive method
State not shared
Comms via async messages
Not nestable
Mutex
Atomic scope
State is shared
Comms via mutable state
Nestable (ACID)
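
The two columns above can be put side by side in a short sketch. It is plain Python with hypothetical names (locked_increment, CounterActor), not Akka: in the mutex style the atomic scope is the locked block around shared state, while in the actor style the state is private and the atomic unit is a single receive call.

```python
import queue
import threading

# Mutex style: state is shared, and the atomic scope is the locked block.
lock = threading.Lock()
shared = {"count": 0}


def locked_increment():
    with lock:
        shared["count"] += 1


# Actor style: state is private, and the atomic unit is one receive call.
class CounterActor:
    def __init__(self):
        self.count = 0
        self.mailbox = queue.Queue()  # thread-safe; senders never touch count

    def receive(self, message):
        if message == "increment":
            self.count += 1

    def drain(self):
        """Process queued messages one at a time on the calling thread."""
        while True:
            try:
                message = self.mailbox.get_nowait()
            except queue.Empty:
                return
            self.receive(message)


actor = CounterActor()


def worker():
    for _ in range(1000):
        locked_increment()              # communicate via mutable state
        actor.mailbox.put("increment")  # communicate via async message


threads = [threading.Thread(target=worker) for _ in range(8)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()
actor.drain()
print(shared["count"], actor.count)
```

Both counters end at the same value. The difference is what can be made atomic: a lock can wrap several steps, whereas an actor cannot extend its atomic unit across two messages, so an operation that spans a request and a reply needs another consistency strategy.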
Atomicity
Anything!!! Nothing
Actors Mutex
Pool
Agent
State
Agent Agent Agent Agent
Atomicity
Eventual consistency
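
A hedged sketch of what eventual consistency can look like for agent assignment, in plain Python with hypothetical names and message kinds (PoolActor, "timed_out", "reconciled"); it is not the talk's implementation. The agent is reserved in the same receive that picks it, so no later message can reuse it before the reply arrives. If the reply never arrives, the task is requeued as a compensating action and the agent is parked as suspect until it has been reconciled.

```python
class PoolActor:
    """Hypothetical pool actor: reserve, then confirm or compensate."""

    def __init__(self, agents):
        self.idle = set(agents)
        self.pending = {}     # agent -> task awaiting a reply
        self.assigned = {}    # agent -> task confirmed
        self.suspect = set()  # agents whose real state is unknown
        self.queue = []       # tasks waiting for an agent

    def receive(self, message):
        kind, value = message
        if kind == "submit":
            self.queue.append(value)
        elif kind == "confirmed":
            self.assigned[value] = self.pending.pop(value)
        elif kind == "timed_out":
            # Compensate: the task goes back to the front of the queue,
            # and the agent is not reused until it has been checked.
            self.queue.insert(0, self.pending.pop(value))
            self.suspect.add(value)
        elif kind == "reconciled":
            self.suspect.discard(value)
            self.idle.add(value)
        self._try_assign()

    def _try_assign(self):
        while self.queue and self.idle:
            agent = min(self.idle)
            self.idle.remove(agent)
            self.pending[agent] = self.queue.pop(0)


pool = PoolActor(["a1", "a2"])
pool.receive(("submit", "t1"))
pool.receive(("submit", "t2"))
pool.receive(("confirmed", "a1"))
pool.receive(("timed_out", "a2"))
after_timeout = (list(pool.queue), set(pool.suspect))
pool.receive(("reconciled", "a2"))
pool.receive(("confirmed", "a2"))
print(after_timeout, pool.assigned)
```

At every step each task sits in exactly one of the queue, pending, or assigned, which is the invariant a recovery test would check.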
Lessons
- Atomicity and Consistency
- Actor modeling ≠ Object modeling
- Test for Resilience not robustness
- Refactor Early

Editor's notes

  1. Principal Engineer – Workday’s Grid Cloud Master team. – Who is Workday
  2. Finance and Human Capital Management – ERP Vendor – 100% in the cloud – all customers on a single version
  3. Fiscal 2016 Total Revenue of $1.16 billion, up 48% year over year Over 5000 employees, over 500 employees in Dublin 2016: Best Workplaces in Ireland, Great Place to Work Institute (#2 for large companies) 2016: 10 Best Large Workplaces in Tech, Fortune (#2)
  4. Provide elastic grid – other services. Reliable execution of background tasks or jobs – PDF printing to payroll. Cloud Master - Agents - Schedule and assign to agents
  5. 5 pools of agents Different types of task, memory size, execution speed
  6. 5 data centers Secure Reliable Safe Isolated – fairness Scalable - Efficient
  7. This talk is about the lessons I learned migrating a multithreaded Java server application to Akka. To support this growth we need to move to stateful services -- Why
  8. Actor model of concurrency: Safer (no deadlocks), easier to reason about, easier to test, better distribution, easier scalability. Then Scala because of Akka – key selling point
  9. Trying to avoid two way relationship (coupling – mutability) Static State should be immutable
  10. Trying to avoid two way relationship (coupling – mutability) Static State should be immutable
  11. Trying to avoid two way relationship (coupling – mutability) Static State should be immutable
  12. Everyone knows about the God class – threading and mutexes make this worse
  13. Some are big - Marlon Brando – some are small - Robert Downey Jr. - me. Even when small - entourage
  14. AgentPoolActor - Responsible for – Agent actors – Queue of tasks – and their assignments Decomposed into separate classes and traits - Still one actor with an entourage
  15. Also drives more bad decisions
  16. AgentPoolActor and AgentStateActor External DB changes – sending notifications – message loss – recovery Caused by movie star – Thought problem was stream of events were inconsistent – fix that State Inconsistent – failure – production outage
  17. … Beauty of split brains
  18. AgentPoolActor takes job from the Queue Assigns it to an Agent Agent might fail and put it back Pool or Agent might own the job - Cannot reliably find the job EG Cancel Job
  19. Who - When
  20. PoolActor has decided to assign task to an agent. Async message to StateActor – PoolActor must ensure agent not reused – before reply. What if the reply times out? Crash - Can I guarantee consistency – what happens to the job?
  21. Chaos Marmoset base actor overrides the unhandled method Messages can cause failures or delays
  22. Horizontal scalability by pushing all state into the database Actors are about data – Actors are Stateful – Impedance Stateless services cannot update the same data as actor
  23. Autonomy – single responsibility If your actors write to the database
  24. We want agent assignments to be consistent
  25. Banking Transactions ACID? No - Suspense Account – Reconciliation – Compensating transactions Must handle failure cases