3. A man, a plan …
Akka recap, distsys background and Akka Cluster basics
We’ll get an overview of how Akka does clustering
Cluster Sharding
Shard actors across cluster
Distributed Data
Eventual consistency
Distributed PubSub
A message bus across the cluster
Cluster Singleton
How to introduce a single point of failure
Akka Streams
Async, backpressured streams
Alpakka
Enterprise integration on top of streams
Akka Persistence
Event sourcing with actors
Akka HTTP
Async streaming HTTP server and client
Cluster Tools
8. Distributed Systems
"A distributed system is one in which the failure of a computer you didn't even know existed can render your own computer unusable"
– Leslie Lamport
9. Why is it so hard?
The Joys of Computer Networks:
Reliability: power failure, old network equipment, network congestion, coffee in the router, rodents, that guy in the IT dept., DDoS attacks…
Latency: loopback vs local net vs shared congested local net vs internet
Bandwidth: again, loopback vs local vs shared local vs internet
10. Why do it, if it is so hard?
Data or processing doesn't fit a single machine
Many objects that should be kept in memory. Many not so powerful servers can be cheaper than one supercomputer.
Elasticity
Being able to scale in (fewer servers) and out (more servers) depending on load. Not paying for servers unless you need them.
Resilience
Building systems that keep working in the face of failures, or degrade gracefully.
11. Actor Model vs Network
Interaction already modelled as immutable messages
Data travels over the network in packets; changes have to be explicitly sent back.
At most once
Data reaches a node on the other side at most once, but can be lost: already part of the model!
A recipient of a message can reply directly to the sender
Regardless of whether there were intermediate recipients of the message
Messages not limited to request-response
Messages can flow in either direction when two systems are connected.
27. Seed nodes
!First seed node
if none of the other nodes in the list are in
the cluster - joins itself to form cluster
Rest of seed nodes
just pings all other nodes and joins as soon
as one is in the cluster responds
Join
28. What would happen if we mess it up?
"I'm the leader, this is the cluster!"
"No! I'm the leader, this is the cluster!"
Two separate clusters form: a split brain.
29. User API of Cluster
Node details
What roles am I in, what is my address
Join, Leave, Down
Programmatic control over cluster membership
Register listeners for cluster events
Every time the cluster state changes the listening
actor will get a message
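A minimal sketch of registering such a listener with the classic Cluster extension, following the pattern in the Akka docs; the class name ClusterListener and the log lines are illustrative:

```java
import akka.actor.AbstractActor;
import akka.cluster.Cluster;
import akka.cluster.ClusterEvent;

// Illustrative actor that subscribes to cluster membership events.
public class ClusterListener extends AbstractActor {
  private final Cluster cluster = Cluster.get(getContext().getSystem());

  @Override
  public void preStart() {
    // Replay current state as events, then receive live events
    cluster.subscribe(getSelf(), ClusterEvent.initialStateAsEvents(),
        ClusterEvent.MemberEvent.class, ClusterEvent.UnreachableMember.class);
  }

  @Override
  public void postStop() {
    cluster.unsubscribe(getSelf());
  }

  @Override
  public Receive createReceive() {
    return receiveBuilder()
        .match(ClusterEvent.MemberUp.class,
            up -> System.out.println("Member up: " + up.member()))
        .match(ClusterEvent.MemberRemoved.class,
            removed -> System.out.println("Member removed: " + removed.member()))
        .match(ClusterEvent.UnreachableMember.class,
            un -> System.out.println("Unreachable: " + un.member()))
        .build();
  }
}
```

Every time membership changes, the subscribed actor receives the matching event message.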
30. Cluster through config
application.conf:
akka {
  actor.provider = cluster
  remote.artery {
    enabled = true
    transport = tcp
    canonical.hostname = 192.168.0.1
    canonical.port = 25520
  }
  cluster.seed-nodes = [
    "akka://cluster@192.168.0.1:25520",
    "akka://cluster@192.168.0.2:25520"
  ]
}
final ActorSystem system =
    ActorSystem.create("cluster");
31. Programmatic join
final ActorSystem node1 = ActorSystem.create("cluster");
final ActorSystem node2 = ActorSystem.create("cluster");
final ActorSystem node3 = ActorSystem.create("cluster");
// joins itself to form cluster
final Cluster node1Cluster = Cluster.get(node1);
node1Cluster.join(node1Cluster.selfAddress());
// joins the cluster through the one node in the cluster
final Cluster node2Cluster = Cluster.get(node2);
node2Cluster.join(node1Cluster.selfAddress());
// subsequent nodes can join through any node that is already in the cluster
final Cluster node3Cluster = Cluster.get(node3);
node3Cluster.join(node2Cluster.selfAddress());
39. Distributed Data
CRDTs: Conflict-free Replicated Data Types
Allow updates on any node, then spread that update to the other cluster nodes through gossip for eventual consistency
Note: does not fit every problem!
Online docs for Distributed Data
40. Special requirements
Commutative
Order of operation does not matter
like 3 + 4 = 4 + 3
Associative
Grouping operations does not matter
like 3 + (4 + 5) = (3 + 4) + 5
Monotonic
Absence of rollbacks, "only growing" (but we can do sneaky tricks)
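These three properties are what make merging replicas safe in any order. A plain-Java sketch of the idea behind a grow-only counter (not Akka's GCounter, just the merge principle): each node increments its own slot, and merge takes the element-wise max, which is commutative, associative and monotonic:

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of a grow-only counter keyed by node id.
// Not Akka's GCounter implementation, just the merge idea behind it.
class GCounterSketch {
  final Map<String, Long> counts = new HashMap<>();

  // Each node only ever increments its own slot.
  void increment(String nodeId) {
    counts.merge(nodeId, 1L, Long::sum);
  }

  // Merge = element-wise max: commutative, associative, monotonic.
  GCounterSketch merge(GCounterSketch other) {
    GCounterSketch result = new GCounterSketch();
    result.counts.putAll(this.counts);
    other.counts.forEach((node, value) ->
        result.counts.merge(node, value, Long::max));
    return result;
  }

  // The counter's value is the sum over all node slots.
  long value() {
    return counts.values().stream().mapToLong(Long::longValue).sum();
  }
}
```

Because merge is element-wise max, `a.merge(b)` and `b.merge(a)` give the same value, and re-merging the same state changes nothing.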
41. Built in data structures
Counters
GCounter - grow only, PNCounter - increment and decrement counter
Sets
GSet - grow only, ORSet - observed remove set
Maps
ORMap - observed remove map, ORMultiMap - observed remove multi map, PNCounterMap - positive negative counter map, LWWMap - last writer wins map
Flags and Register
Flag - toggle once boolean, LWWRegister - last writer wins register
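As one example of the register semantics, a plain-Java sketch of last-writer-wins (not Akka's LWWRegister, which also breaks timestamp ties deterministically; this just shows the idea): every write carries a timestamp, and merge keeps the newer write:

```java
// Plain-Java sketch of last-writer-wins semantics.
// Akka's LWWRegister additionally breaks timestamp ties deterministically.
class LwwSketch<T> {
  final T value;
  final long timestamp;

  LwwSketch(T value, long timestamp) {
    this.value = value;
    this.timestamp = timestamp;
  }

  // Merge keeps the write with the larger timestamp.
  LwwSketch<T> merge(LwwSketch<T> other) {
    return other.timestamp > this.timestamp ? other : this;
  }
}
```

Note the trade-off: a concurrent older write is silently discarded, which is why LWW types are only a fit when "newest wins" is acceptable.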
60. Multi-DC support in cluster tools
Sharding
Per DC, proxy defaults to the local DC, can be started with a data-center setting
Distributed Data and PubSub
No special support (optimisation possibilities)
Cluster Singleton
Is a singleton per DC
Multi-DC Persistence
Commercial feature allowing for active-active or active-passive eventually consistent persistence between datacenters (docs)
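For reference, a hedged HOCON sketch of how a node declares which data center it belongs to (the setting name is from the Akka multi-DC cluster docs; the value dc-east is illustrative):

```hocon
akka {
  cluster {
    # Which data center this node belongs to; nodes with the same
    # value are treated as one data center by sharding and singleton.
    multi-data-center.self-data-center = "dc-east"
  }
}
```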