Reactive applications are becoming a de facto industry standard and, if employed correctly, toolkits like Lightbend Reactive Platform make the implementation easier than ever. But designing these systems can be challenging, as it requires a particular mindset shift to tackle problems we might not be used to. In this talk, we’re going to discuss the most common things I’ve seen in the field that prevented applications from working as expected. I’d like to talk about typical pitfalls that might cause trouble, trade-offs that might not be fully understood, and important choices that might be overlooked, including pitfalls of persistent actors, handling of network partitions, proper implementation of graceful shutdown and distributed transactions, trade-offs of microservices and actors, and more. This talk should be interesting for anyone who is thinking about, implementing, or has already deployed a reactive application. My goal is to provide a comprehensive explanation of common problems so that fellow developers do not repeat them. The talk is a little more focused on the Lightbend platform, but understanding the concepts we are going to talk about should be beneficial for everyone interested in this field.
8. Pick the Right Tool for The Job
[Diagram: Akka Typed, Akka Actors and Akka Stream arranged on a scale from "Power" to "Constraints", with the labels "Local Abstractions" and "Distribution"]
9. Actor Use Cases
● State management
● Location transparency
● Resilience mechanisms
● Single writer
● In-memory lock-free cache
● Sharding
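To make the "state management" and "single writer" points concrete, here is a minimal, language-neutral sketch in Python rather than Akka. The `CounterActor` class and its `tell`/`ask` methods are hypothetical names chosen for illustration; the sketch only assumes that an actor is a mailbox plus one thread that owns the state.

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: a mailbox plus one thread that owns the state."""

    _STOP = object()  # sentinel that ends the message loop

    def __init__(self):
        self._mailbox = queue.Queue()
        self._count = 0  # private state, only mutated by the actor thread
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def tell(self, message):
        """Fire-and-forget send: enqueue and return immediately."""
        self._mailbox.put((message, None))

    def ask(self, message, timeout=1.0):
        """Send a message and block until the actor replies."""
        reply = queue.Queue(maxsize=1)
        self._mailbox.put((message, reply))
        return reply.get(timeout=timeout)

    def stop(self):
        self._mailbox.put(self._STOP)
        self._thread.join()

    def _run(self):
        # Messages are handled one at a time, so the handler needs no locks.
        while True:
            item = self._mailbox.get()
            if item is self._STOP:
                return
            message, reply = item
            if message == "increment":
                self._count += 1
            elif message == "get" and reply is not None:
                reply.put(self._count)
```

Many threads can call `tell("increment")` concurrently, yet the counter is only ever written by the actor's own thread, which is the single-writer property the slide refers to.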
10. Future Use Cases
● Local Concurrency
● Simplicity
● Composition
● Typesafety
11. Avoid Java Serialization
Java serialization is the default in Akka because it is
easy to start with, but it is very slow and has a
heavy footprint
17. Points of Interest
● Performance
● Footprint
● Schema evolution
● Implementation effort
● Human readability
● Language bindings
● Backwards & forwards compatibility
● ...
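As an illustration of the "backwards & forwards compatibility" point, the following sketch shows a tolerant reader in Python. The record layout and the function names `decode_user_v1` and `decode_user_v2` are assumptions made up for this example; JSON is used only because it is in the standard library.

```python
import json


def decode_user_v1(payload: bytes) -> dict:
    """Old reader: knows only 'id' and 'name' and ignores anything else."""
    raw = json.loads(payload)
    return {"id": raw["id"], "name": raw["name"]}


def decode_user_v2(payload: bytes) -> dict:
    """New reader: 'email' was added in v2, so it falls back to a default."""
    raw = json.loads(payload)
    return {
        "id": raw["id"],
        "name": raw["name"],
        "email": raw.get("email", ""),  # missing in data written by v1
    }


# Data written by each version of the (hypothetical) writer.
v1_payload = json.dumps({"id": 1, "name": "Ada"}).encode("utf-8")
v2_payload = json.dumps({"id": 2, "name": "Bob", "email": "bob@example.com"}).encode("utf-8")
```

The new reader handling `v1_payload` is backwards compatibility; the old reader handling `v2_payload` without failing is forwards compatibility. This matters most for persisted events, which may be read long after the code that wrote them has changed.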
18. JSON
● Advantages:
○ Human readability
○ Simple & well known
○ Many good libraries for all platforms
● Disadvantages:
○ Slow
○ Large
○ Object names included
○ No schema (except e.g. JSON Schema)
○ Format and precision issues
● json4s, circe, µPickle, spray-json, argonaut, rapture-json, play-json, …
19. Binary formats [Schema-less]
● Metadata sent together with data
● Advantages:
○ Implementation effort
○ Performance
○ Footprint *
● Disadvantages:
○ No human readability
● Kryo, Binary JSON (MessagePack, BSON, ... )
20. Binary formats [Schema]
● Schema defined by some kind of DSL
● Advantages:
○ Performance
○ Footprint
○ Schema evolution
● Disadvantages:
○ Implementation effort
○ No human readability
● Protobuf (+ projects like Flatbuffers, Cap’n Proto, etc.), Thrift, Avro
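The footprint difference between JSON and a schema-based binary format can be sketched with the Python standard library alone. This is not Protobuf, Thrift or Avro: the fixed `struct` layout below merely stands in for "a schema agreed on up front", and the record itself is a made-up example.

```python
import json
import struct

# Hypothetical record, used only for illustration.
record = {"sensor_id": 42, "timestamp": 1700000000, "temperature": 21.5}

# JSON is self-describing: the field names travel with every message.
json_bytes = json.dumps(record).encode("utf-8")

# With a schema, field order and types are known to both sides,
# so only the values are sent.
# "<IQd" = little-endian uint32, uint64, float64 (4 + 8 + 8 = 20 bytes)
SCHEMA = struct.Struct("<IQd")
binary_bytes = SCHEMA.pack(
    record["sensor_id"], record["timestamp"], record["temperature"]
)


def decode(data: bytes) -> dict:
    """Rebuild the record; the field names come from the schema, not the wire."""
    sensor_id, timestamp, temperature = SCHEMA.unpack(data)
    return {"sensor_id": sensor_id, "timestamp": timestamp, "temperature": temperature}
```

The binary form is 20 bytes and the JSON form is noticeably larger, mostly because of the embedded field names. The price is the one listed above: the bytes are unreadable without the schema, and the schema has to be maintained.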
21. Summary
● The default serializer should always be changed
● The right choice depends on the particular use case
● Quick tips:
○ json4s
○ kryo
○ protobuf
22. Graceful Shutdown
We have thousands of sharded actors on
multiple nodes and we want to shut one of
the nodes down
29. High-level Procedure
1. JVM gets the shutdown signal
2. Coordinator tells all local ShardRegions to shut down gracefully
3. Node leaves cluster
4. Coordinator gives singletons a grace period to migrate
5. Actor System & JVM Termination
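The procedure above can be sketched as a coordinator that runs registered tasks in a fixed phase order. This is a simplified model in Python, not Akka code: the `ShutdownCoordinator` class and the phase names are hypothetical, and calling `run()` stands in for step 1, the JVM receiving the shutdown signal.

```python
class ShutdownCoordinator:
    """Hypothetical coordinator that runs shutdown tasks phase by phase."""

    # Phases mirror steps 2 to 5 of the procedure; the names are assumptions.
    PHASES = ("shard-regions", "cluster-leave", "singleton-grace", "terminate")

    def __init__(self):
        self._tasks = {phase: [] for phase in self.PHASES}
        self.log = []

    def add_task(self, phase, name, task):
        """Register a zero-argument callable to run in the given phase."""
        if phase not in self._tasks:
            raise ValueError(f"unknown phase: {phase}")
        self._tasks[phase].append((name, task))

    def run(self):
        """Run all phases in order; a failing task does not block later phases."""
        for phase in self.PHASES:
            for name, task in self._tasks[phase]:
                try:
                    task()
                    self.log.append((phase, name, "ok"))
                except Exception as exc:
                    self.log.append((phase, name, f"failed: {exc}"))
        return self.log
```

Tasks can be registered in any order because the phase list alone decides when they run. A real implementation would also put a timeout on each phase so that a stuck shard region cannot keep the node alive forever; that part is left out here.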
30. Integration with Sharded Actors
● Handling of added messages
○ Passivate message for a graceful stop
○ context.stop() for an immediate stop
● Priority mailbox
○ Priority message handling
○ Message retrying support
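The difference between a graceful and an immediate stop can be shown with a toy mailbox. The sketch below is a deliberately simplified Python model, not Akka's sharding protocol: the `Entity` class and both functions are hypothetical, and the priority mailbox and retry handling from the slide are not modeled.

```python
from collections import deque


class Entity:
    """Hypothetical sharded entity with a FIFO mailbox."""

    def __init__(self):
        self.mailbox = deque()
        self.processed = []
        self.stopped = False

    def drain(self):
        """Handle queued messages in order until a stop message is reached."""
        while self.mailbox and not self.stopped:
            message = self.mailbox.popleft()
            if message == "stop":
                self.stopped = True
            else:
                self.processed.append(message)


def graceful_stop(entity):
    """Queue the stop message behind pending work, then let the entity drain."""
    entity.mailbox.append("stop")
    entity.drain()
    return list(entity.mailbox)  # messages that were never handled


def immediate_stop(entity):
    """Stop right away; whatever is still queued is not handled."""
    entity.stopped = True
    return list(entity.mailbox)
```

With two pending messages, `graceful_stop` processes both before stopping, while `immediate_stop` leaves both unhandled. That loss is the reason the graceful path is preferred when data must not be dropped.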
31. Summary
● We don’t want to lose data (usually)
● Shutdown coordinator on every node
● Integration with sharded actors
32. Distributed Transactions
Any situation where a single event results in
the mutation of two separate sources of data
which cannot be committed atomically
33. What’s Wrong With Them
● Simple happy paths
● 8 Fallacies of Distributed Computing
○ The network is reliable.
○ Latency is zero.
○ Bandwidth is infinite.
○ The network is secure.
○ Topology doesn't change.
○ There is one administrator.
○ Transport cost is zero.
○ The network is homogeneous.
34. Two-phase commit (2PC)
[Diagram: in Stage 1 (Prepare), the Transaction Manager sends Prepare to each Resource Manager and each replies Prepared; in Stage 2 (Commit), the Transaction Manager sends Commit to each Resource Manager and each replies Committed]
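The two stages can be written down as a short happy-path sketch in Python. The `ResourceManager` and `TransactionManager` classes are illustrative only; they assume in-process calls and ignore timeouts, crashes and recovery logs.

```python
class ResourceManager:
    """Hypothetical participant that votes in stage 1 and applies stage 2."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Stage 1: promise to commit (vote yes) or refuse (vote no).
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


class TransactionManager:
    """Hypothetical coordinator driving both stages."""

    def __init__(self, resources):
        self.resources = list(resources)

    def execute(self):
        # Stage 1 - Prepare: collect a vote from every participant.
        votes = [rm.prepare() for rm in self.resources]
        if all(votes):
            # Stage 2 - Commit: everyone voted yes.
            for rm in self.resources:
                rm.commit()
            return True
        # Any "no" vote aborts the transaction on all participants.
        for rm in self.resources:
            rm.rollback()
        return False
```

What the sketch leaves out is exactly what makes 2PC fragile: if the transaction manager fails between the two stages, prepared participants stay blocked until it comes back.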
36. The Big Trade-Off
● Distributed transactions can usually be avoided
○ They are hard, expensive, fragile and do not scale
● Every business event needs to result in a single synchronous commit
● Other data sources should be updated asynchronously
● This introduces eventual consistency
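One common way to get "a single synchronous commit" plus asynchronous updates is a transactional outbox, sketched below with Python's built-in `sqlite3`. This is one possible technique, not necessarily the one the talk has in mind; the table names, `place_order` and `publish_pending` are all hypothetical.

```python
import json
import sqlite3


def open_store():
    """Create an in-memory database with an orders table and an outbox."""
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, total REAL)")
    db.execute(
        "CREATE TABLE outbox ("
        "seq INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT NOT NULL, "
        "published INTEGER NOT NULL DEFAULT 0)"
    )
    return db


def place_order(db, order_id, total):
    """The single synchronous commit: state change and event in one transaction."""
    event = json.dumps({"order_id": order_id, "total": total})
    with db:  # commits on success, rolls back if an exception is raised
        db.execute("INSERT INTO orders VALUES (?, ?)", (order_id, total))
        db.execute("INSERT INTO outbox (payload) VALUES (?)", (event,))


def publish_pending(db, read_model):
    """Asynchronous step: apply unpublished events to another data source."""
    rows = db.execute(
        "SELECT seq, payload FROM outbox WHERE published = 0 ORDER BY seq"
    ).fetchall()
    for seq, payload in rows:
        event = json.loads(payload)
        read_model[event["order_id"]] = event["total"]  # idempotent upsert
        with db:
            db.execute("UPDATE outbox SET published = 1 WHERE seq = ?", (seq,))
    return len(rows)
```

Between `place_order` and the next `publish_pending` run, the read model lags behind the orders table. That window is the eventual consistency named in the last bullet. Because the upsert is idempotent, replaying an event after a crash is harmless.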
37. Longtail Latencies
Consider a system where each service
typically responds in 10ms but with a 99th
percentile latency of one second
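A short calculation shows why that 99th percentile matters once a request fans out to many services. The sketch assumes that calls are independent and that a request is slow if at least one of its calls is slow; `p_slow_request` is a hypothetical helper name.

```python
def p_slow_request(fan_out: int, p_slow_call: float = 0.01) -> float:
    """Probability that at least one of `fan_out` independent calls is slow.

    With a 99th percentile of one second, each call has a 1% chance
    of taking one second or longer (p_slow_call = 0.01).
    """
    return 1.0 - (1.0 - p_slow_call) ** fan_out


if __name__ == "__main__":
    for n in (1, 10, 100):
        print(f"fan-out {n:>3}: {p_slow_request(n):.1%} of requests hit the slow path")
```

Under these assumptions a request that touches a single service is slow 1% of the time, but a request that touches 100 services is slow about 63% of the time, since 1 - 0.99^100 ≈ 0.634.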