11. Managing Service Dependencies
Distributed architectures can have dozens of dependencies
Each can fail independently
Even 0.01% downtime on each of dozens of services
can add up to hours of downtime per month if the system is not
engineered for resilience
[Diagram: a request flowing through Service A, Service B, and Service C and their dependencies]
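The downtime claim above can be checked with a little arithmetic. A minimal sketch, assuming failures are independent, that a request needs every dependency up at once, and that "dozens" means 30 services (the count and the 30-day month are assumptions, not figures from the slide):

```java
public class CompoundAvailability {
    /** Availability of a path that needs all `services` dependencies up, assuming independent failures. */
    static double combinedAvailability(double perServiceAvailability, int services) {
        return Math.pow(perServiceAvailability, services);
    }

    /** Expected downtime in hours over an assumed 30-day month. */
    static double downtimeHoursPerMonth(double availability) {
        double hoursPerMonth = 30 * 24;
        return (1.0 - availability) * hoursPerMonth;
    }

    public static void main(String[] args) {
        // 0.01% downtime per service means 99.99% availability per service.
        double combined = combinedAvailability(0.9999, 30);
        System.out.printf("combined availability: %.4f%%%n", combined * 100);
        System.out.printf("downtime per month: %.2f hours%n", downtimeHoursPerMonth(combined));
    }
}
```

With these assumed numbers the combined availability drops to roughly 99.7%, which is on the order of two hours of downtime per month.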
15. Hystrix to the Rescue
Wrap calls to external systems in a
dependency command object and run it in a
separate thread
Time out calls that run longer than roughly the 99.5th percentile of all
latencies
Limit the number of threads with a pool or semaphore
Measure success, trip the circuit if needed
Perform fallback logic
Monitor metrics and change configuration in real time
See the Hystrix Wiki
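The ideas on this slide (separate thread, timeout, bounded pool, circuit, fallback) can be sketched with plain java.util.concurrent. This is not the Hystrix API: the class name GuardedCall, its parameters, and the consecutive-failure rule for tripping the circuit are assumptions made for illustration, and the sketch never closes the circuit again once it opens.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

/** Hypothetical wrapper for calls to one dependency; a sketch, not Hystrix. */
public class GuardedCall<T> {
    private final ExecutorService pool;
    private final long timeoutMillis;
    private final int failureThreshold;
    private final AtomicInteger consecutiveFailures = new AtomicInteger();

    public GuardedCall(int poolSize, long timeoutMillis, int failureThreshold) {
        // A bounded pool of daemon threads isolates this dependency from the caller's threads.
        this.pool = Executors.newFixedThreadPool(poolSize, r -> {
            Thread t = new Thread(r, "guarded-call");
            t.setDaemon(true);
            return t;
        });
        this.timeoutMillis = timeoutMillis;
        this.failureThreshold = failureThreshold;
    }

    /** The circuit is open once the assumed number of consecutive failures is reached. */
    public boolean isCircuitOpen() {
        return consecutiveFailures.get() >= failureThreshold;
    }

    public T execute(Callable<T> call, Supplier<T> fallback) {
        if (isCircuitOpen()) {
            return fallback.get(); // short-circuit: do not touch the dependency at all
        }
        Future<T> future = pool.submit(call);
        try {
            T result = future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            consecutiveFailures.set(0); // success resets the failure count
            return result;
        } catch (Exception e) { // timeout, failure inside the call, or interruption
            if (e instanceof InterruptedException) {
                Thread.currentThread().interrupt();
            }
            future.cancel(true);
            consecutiveFailures.incrementAndGet();
            return fallback.get();
        }
    }
}
```

A caller might write `new GuardedCall<String>(10, 250, 5).execute(() -> remoteLookup(), () -> "cached value")`, where remoteLookup is a hypothetical remote call. A production circuit breaker would also measure error rates over a window and retry the dependency after a cool-down period.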
16. Taming Tail Latencies of Service Calls
Real-time metrics show problems as they occur
Trends help configure timeouts
Set timeouts based on histogram data
The 99.5th percentile plus a buffer is a good start
Tier timeouts for retries on other servers (e.g., 5, 15, 30)
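The "99.5th percentile plus buffer" rule can be sketched as follows. The nearest-rank percentile method, the millisecond unit, the sample data, and the 20 ms buffer are all assumptions for illustration; a real system would read the percentile from its metrics histogram.

```java
import java.util.Arrays;

public class TimeoutFromHistogram {
    /** Nearest-rank percentile: the smallest sample with at least p percent of samples at or below it. */
    static long percentile(long[] latenciesMillis, double p) {
        if (latenciesMillis.length == 0) {
            throw new IllegalArgumentException("no latency samples");
        }
        long[] sorted = latenciesMillis.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    /** Suggested timeout: 99.5th percentile of observed latencies plus a safety buffer. */
    static long suggestedTimeoutMillis(long[] latenciesMillis, long bufferMillis) {
        return percentile(latenciesMillis, 99.5) + bufferMillis;
    }

    public static void main(String[] args) {
        // Synthetic samples: 1 ms, 2 ms, ..., 1000 ms.
        long[] samples = new long[1000];
        for (int i = 0; i < samples.length; i++) {
            samples[i] = i + 1;
        }
        System.out.println("suggested timeout (ms): " + suggestedTimeoutMillis(samples, 20));
    }
}
```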
18. Processing a Data/Event Stream
Iterator<T> iterator = dataStream.iterator();
while(iterator.hasNext()) { process(iterator.next()); }
What if dataStream represents an unbounded stream?
What if data comes over the network, with latencies and failures?
What if data comes from multiple sources?
How would you manage concurrency? Threads? Semaphores?
RxJava, an implementation of Reactive Extensions, addresses these questions
“...provides a collection of operators with which you can filter, select,
transform, combine, and compose Observables. This allows for efficient
execution and composition…”
19. RxJava
Java implementation of Reactive Extensions
A library for composing asynchronous and event-based programs by using
observable sequences
Extends the observer pattern to support sequences of data/events and adds
operators that allow you to compose sequences together declaratively while
abstracting away concerns about things like low-level threading,
synchronization, thread-safety, concurrent data structures, and non-blocking
I/O
Event           Iterable (pull)     Observable (push)
retrieve data   T next()            onNext(T)
discover error  throws Exception    onError(Exception)
complete        returns             onCompleted()
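The pull and push columns can be contrasted in plain Java. A minimal sketch: SimpleObserver is a hypothetical stand-in whose three callbacks correspond to the push column, not the RxJava type, and here the producer pushes synchronously on the calling thread for simplicity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

public class PullVersusPush {
    /** Hypothetical observer with the three push-side callbacks; not the RxJava interface. */
    public interface SimpleObserver<T> {
        void onNext(T value);
        void onError(Exception error);
        void onCompleted();
    }

    /** Pull: the consumer drives the loop and asks for each item with next(). */
    static List<String> pull(Iterable<String> source) {
        List<String> log = new ArrayList<>();
        try {
            Iterator<String> it = source.iterator();
            while (it.hasNext()) {
                log.add("next => " + it.next());
            }
            log.add("returned");               // completion: the loop simply ends
        } catch (RuntimeException e) {
            log.add("threw " + e.getMessage()); // error: an exception reaches the consumer
        }
        return log;
    }

    /** Push: the producer drives and reports data, errors, and completion through callbacks. */
    static void push(Iterable<String> source, SimpleObserver<String> observer) {
        try {
            for (String item : source) {
                observer.onNext(item);
            }
        } catch (Exception e) {
            observer.onError(e);
            return;
        }
        observer.onCompleted();
    }

    /** Observer that records every callback into the given log. */
    static SimpleObserver<String> loggingObserver(final List<String> log) {
        return new SimpleObserver<String>() {
            @Override public void onNext(String value) { log.add("onNext => " + value); }
            @Override public void onError(Exception error) { log.add("onError => " + error.getMessage()); }
            @Override public void onCompleted() { log.add("onCompleted"); }
        };
    }

    public static void main(String[] args) {
        List<String> data = Arrays.asList("a", "b");
        System.out.println(pull(data));
        List<String> pushed = new ArrayList<>();
        push(data, loggingObserver(pushed));
        System.out.println(pushed);
    }
}
```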
21. Example Code: Iterable and Observable
// Iterable (pull)
getDataFromLocalMemory()
    .skip(10)
    .take(5)
    .map({ s -> return s + " transformed" })
    .forEach({ println "next => " + it })

// Observable (push)
getDataFromNetwork()
    .skip(10)
    .take(5)
    .map({ s -> return s + " transformed" })
    .subscribe({ println "onNext => " + it })
Data can be pushed from multiple sources
No need to block for result availability
RxJava is a tool to react to push data
Java Futures are an alternative, but composing nested async calls with them is non-trivial
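A small sketch of why plain Futures get awkward when one async call depends on another: the caller has to block on get() between the two steps. The method name and the "user" and "orders" values are hypothetical placeholders for two dependent remote calls.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class NestedFutures {
    /** Two dependent async steps; the second cannot be submitted until the first result is in hand. */
    static String fetchOrdersForUser(ExecutorService pool) {
        try {
            Future<String> userFuture = pool.submit(() -> "user-42");
            // Blocks the calling thread: java.util.concurrent.Future has no callback or chaining.
            String user = userFuture.get(1, TimeUnit.SECONDS);

            Future<String> ordersFuture = pool.submit(() -> "orders-for-" + user);
            // Blocks again for the dependent call.
            return ordersFuture.get(1, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException | TimeoutException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            System.out.println(fetchOrdersForUser(pool));
        } finally {
            pool.shutdown();
        }
    }
}
```

Each additional level of dependency adds another blocking get() and another set of error paths, which is the nesting problem the push model avoids.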
23. Async IO with Netty, RxNetty
Netty is an NIO client-server framework
(see Java IO Vs. NIO)
Supports non-blocking IO
High throughput, low latency, less resource consumption
RxNetty is a Reactive Extensions adaptor for Netty
When using something like Netty,
total number of threads in the app = total number of cores in the system
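The thread-count rule can be illustrated without Netty. A minimal sketch, assuming short non-blocking tasks: a pool sized to the core count handles many more tasks than it has threads. The class and method names are hypothetical, and Netty's own event loop configuration is not shown here.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class CoreSizedPool {
    static int coreCount() {
        return Runtime.getRuntime().availableProcessors();
    }

    /** Runs taskCount short tasks on a pool sized to the core count; returns how many threads did the work. */
    static int distinctThreadsUsed(int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(coreCount());
        Set<String> threadNames = ConcurrentHashMap.newKeySet();
        CountDownLatch done = new CountDownLatch(taskCount);
        try {
            for (int i = 0; i < taskCount; i++) {
                pool.execute(() -> {
                    // Non-blocking work: record which thread ran the task and finish immediately.
                    threadNames.add(Thread.currentThread().getName());
                    done.countDown();
                });
            }
            if (!done.await(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
            return threadNames.size();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("cores: " + coreCount());
        System.out.println("threads used for 1000 tasks: " + distinctThreadsUsed(1000));
    }
}
```

The rule only holds while tasks never block; a single blocking call on one of these threads stalls every task queued behind it.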
24. RxNetty Server Example
public static void main(final String[] args) {
    final int port = 8080;
    // Create an HTTP server whose handler logs the request line and headers.
    RxNetty.createHttpServer(port, new RequestHandler<ByteBuf, ByteBuf>() {
        @Override
        public Observable<Void> handle(HttpServerRequest<ByteBuf> request, final HttpServerResponse<ByteBuf> response) {
            System.out.println("New request received");
            System.out.println(request.getHttpMethod() + " " + request.getUri() + ' ' + request.getHttpVersion());
            for (Map.Entry<String, String> header : request.getHeaders().entries()) {
                System.out.println(header.getKey() + ": " + header.getValue());
            }
<continued…>
25. RxNetty Server Example (Cont.)
            // Turn each content event (data, error, completion) into a Notification.
            return request.getContent().materialize()
                .flatMap(new Func1<Notification<ByteBuf>, Observable<Void>>() {
                    @Override
                    public Observable<Void> call(Notification<ByteBuf> notification) {
                        if (notification.isOnCompleted()) {
                            // Request body fully read: write the response.
                            return response.writeStringAndFlush("Welcome!!!");
                        } else if (notification.isOnError()) {
                            return Observable.error(notification.getThrowable());
                        } else {
                            // A chunk of request content: print it and emit nothing.
                            ByteBuf next = notification.getValue();
                            System.out.println(next.toString(Charset.defaultCharset()));
                            return Observable.empty();
                        }
                    }
                });
        }
    }).startAndWait();
}
27. Mesos Cluster Manager
Resource allocation across distributed applications (aka
Frameworks) on a shared pool of nodes.
Akin to Google Borg
Pluggable isolation for CPU, I/O, etc. via Linux cgroups,
Docker, etc.
Fault-tolerant leader election via ZooKeeper
Used at Twitter, Airbnb, etc.