Fault tolerant microservices
BSkyB
@chbatey
Who is this guy? 
● Enthusiastic nerd 
● Senior software engineer at BSkyB 
● Builds a lot of distributed applications 
● Apache Cassandra MVP
Agenda 
1. Setting the scene 
○ What do we mean by a fault? 
○ What is a microservice? 
○ Monolith application vs the micro(ish) service 
2. A worked example 
○ Identify an issue 
○ Reproduce/test it 
○ Show how to deal with the issue
So… what do applications look like?
So... what do systems look like now?
But different things go wrong...
down 
slow network 
slow app 
2 second max 
GC :( 
missing packets
Fault tolerance 
1. Don’t take forever - Timeouts 
2. Don’t try if you can’t succeed 
3. Fail gracefully 
4. Know if it’s your fault 
5. Don’t whack a dead horse 
6. Turn broken stuff off 
Time for an example... 
● All examples are on GitHub
● Technologies used:
○ Dropwizard 
○ Spring Boot 
○ Wiremock 
○ Hystrix 
○ Graphite 
○ Saboteur
Example: Movie player service 
[Diagram: Shiny App instances, User Service, Device Service, Pin Service; request: Play Movie]
Testing microservices 
You don’t know a service is fault tolerant if you don’t test faults
Isolated service tests 
[Diagram: Acceptance Test, Shiny App, Mocks (User, Device, Pin service); arrows: Prime, Play Movie]
1 - Don’t take forever 
● If at first you don’t succeed, don’t take forever to tell someone
● Timeout and fail fast
Which timeouts? 
● Socket connection timeout 
● Socket read timeout 
Your service hung for 30 seconds :( 
Customer 
You :(
Which timeouts? 
● Socket connection timeout 
● Socket read timeout 
● Resource acquisition 
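The first two timeouts can be set on a plain JDK HTTP connection. A minimal sketch, assuming a hypothetical helper class and illustrative timeout values (neither comes from the talk's example repositories):

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URI;

// Hypothetical helper: the class name and timeout values are illustrative only.
final class TimeoutConfiguredClient {
    static final int CONNECT_TIMEOUT_MS = 500;  // max time to establish the TCP connection
    static final int READ_TIMEOUT_MS = 1000;    // max time a single blocking read may wait

    // Creates an HTTP connection object with both timeouts set.
    // No network traffic happens until the caller connects or reads.
    static HttpURLConnection open(String url) {
        try {
            HttpURLConnection connection =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            connection.setConnectTimeout(CONNECT_TIMEOUT_MS);
            connection.setReadTimeout(READ_TIMEOUT_MS);
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that the read timeout bounds each blocking read, not the whole response, so a dependency that trickles bytes can still take far longer in total. Resource acquisition (waiting for a pooled connection or a free thread) is not covered by either setting.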
Your service hung for 10 minutes :(
Let’s think about this
A little more detail
Wiremock + Saboteur + Vagrant 
● Vagrant - launches + provisions local VMs
● Saboteur - uses tc, iptables to simulate network issues
● Wiremock - used to mock HTTP dependencies
● Cucumber - acceptance tests
I can write an automated test for that? 
[Diagram: Vagrant + VirtualBox VM hosting Wiremock (User Service, Device Service, Pin Service) and Saboteur; Play Movie Service; Acceptance Test; arrows: prime to drop traffic, reset]
Implementing reliable timeouts
● Homemade: Worker Queue + Thread pool (executor)
● Hystrix
● Spring Cloud Netflix
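The homemade option could look roughly like this: run the call on a bounded pool and wait for the result with a deadline. This is a sketch using only the JDK. The class name, pool size, queue size and fallback handling are assumptions, not the talk's code.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper: enforces an overall deadline on a call, whatever the call does internally.
final class TimeLimitedCaller {
    // 4 worker threads and a queue of 10 are illustrative sizes.
    private final ThreadPoolExecutor executor =
            new ThreadPoolExecutor(4, 4, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(10));

    // Returns the task's result, or the fallback if it fails or misses the deadline.
    // If the pool and queue are full, submit() throws RejectedExecutionException (fail fast).
    <T> T call(Callable<T> task, long timeoutMs, T fallback) {
        Future<T> future = executor.submit(task);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so it does not stay busy
            return fallback;
        } catch (ExecutionException e) {
            return fallback;     // the task itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return fallback;
        }
    }

    void shutdown() {
        executor.shutdownNow();
    }
}
```

The caller is released at the deadline even when the worker is stuck in code that ignores socket timeouts. The cost is a thread hand-off per call, which is the same trade-off the Hystrix thread pool makes.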
A simple Spring RestController 
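import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;

@RestController
public class Resource {
    private static final Logger LOGGER = LoggerFactory.getLogger(Resource.class);

    @Autowired
    private ScaryDependency scaryDependency;

    @RequestMapping("/scary")
    public String callTheScaryDependency() {
        LOGGER.info("RestController: I wonder which thread I am on!");
        return scaryDependency.getScaryString();
    }
}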
Scary dependency 
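import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

@Component
public class ScaryDependency {
    private static final Logger LOGGER = LoggerFactory.getLogger(ScaryDependency.class);

    public String getScaryString() {
        LOGGER.info("Scary dependency: I wonder which thread I am on!");
        if (System.currentTimeMillis() % 2 == 0) {
            return "Scary String";
        } else {
            try {
                Thread.sleep(10000); // simulate a slow dependency
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
            }
            return "Really slow scary string";
        }
    }
}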
All on the Tomcat thread
13:07:32.814 [http-nio-8080-exec-1] INFO info.batey.examples.Resource - RestController: I wonder which thread I am on!
13:07:32.896 [http-nio-8080-exec-1] INFO info.batey.examples.ScaryDependency - Scary dependency: I wonder which thread I am on!
Seriously this simple now? 
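import com.netflix.hystrix.contrib.javanica.annotation.HystrixCommand;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.stereotype.Component;

@Component
public class ScaryDependency {
    private static final Logger LOGGER = LoggerFactory.getLogger(ScaryDependency.class);

    @HystrixCommand
    public String getScaryString() {
        LOGGER.info("Scary dependency: I wonder which thread I am on!");
        if (System.currentTimeMillis() % 2 == 0) {
            return "Scary String";
        } else {
            try {
                Thread.sleep(10000); // simulate a slow dependency
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt status
            }
            return "Really slow scary string";
        }
    }
}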
What an annotation can do...
13:07:32.814 [http-nio-8080-exec-1] INFO info.batey.examples.Resource - RestController: I wonder which thread I am on!
13:07:32.896 [hystrix-ScaryDependency-1] INFO info.batey.examples.ScaryDependency - Scary dependency: I wonder which thread I am on!
Timeouts take home 
● You can’t use network level timeouts for SLAs
● Test your SLAs - if someone says you can’t, 
hit them with a stick 
● Scary things happen without network issues
2 - Don’t try if you can’t succeed 
Complexity 
● When an application grows in complexity it 
will eventually start sending emails 
Complexity
● When an application grows in complexity it will eventually contain queues and thread pools
Don’t try if you can’t succeed
● Executors with unbounded queues :(
○ newFixedThreadPool
○ newSingleThreadExecutor
○ newCachedThreadPool (unbounded threads rather than an unbounded queue)
● Bound your queues and threads
● Fail quickly when the queue / maxPoolSize is met
● Know your drivers
This is a functional requirement
● Set the timeout very high
● Use Wiremock to add a large delay to the requests
● Set queue size and thread pool size to 1
● Send in 2 requests to use the thread and fill the queue
● What happens on the 3rd request?
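The bounded pool and the three-request scenario can be sketched with the JDK alone. The class and method names below are hypothetical. The real test in the talk goes through HTTP with Wiremock delays rather than calling the executor directly.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical helper showing a pool with bounded threads and a bounded queue.
final class BoundedExecutors {
    // Fixed thread count plus a bounded queue; AbortPolicy rejects new work
    // immediately once every thread is busy and the queue is full.
    static ThreadPoolExecutor newBounded(int threads, int queueSize) {
        return new ThreadPoolExecutor(
                threads, threads,
                0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(queueSize),
                new ThreadPoolExecutor.AbortPolicy());
    }

    // Returns true if the task was accepted, false if the pool was saturated.
    static boolean trySubmit(ThreadPoolExecutor executor, Runnable task) {
        try {
            executor.execute(task);
            return true;
        } catch (RejectedExecutionException e) {
            return false;
        }
    }
}
```

With `newBounded(1, 1)` and tasks that block, the first task occupies the thread, the second fills the queue, and the third is rejected straight away instead of waiting behind work that cannot finish in time.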
3 - Fail gracefully 
Expect rubbish 
● Expect invalid HTTP 
● Expect malformed response bodies 
● Expect connection failures 
● Expect huge / tiny responses 
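One way to act on this is to validate a dependency's response strictly before trusting it. A sketch, assuming the pin check returns a body of `true` or `false` (the parser class and the length limit are hypothetical):

```java
import java.util.Optional;

// Hypothetical parser: anything unexpected yields "no answer" instead of a guess.
final class PinCheckResponseParser {
    static final int MAX_BODY_LENGTH = 16; // assumption: a valid body is tiny

    static Optional<Boolean> parse(int statusCode, String body) {
        if (statusCode != 200 || body == null) {
            return Optional.empty();            // error status or missing body
        }
        if (body.length() > MAX_BODY_LENGTH) {
            return Optional.empty();            // suspiciously large response
        }
        String trimmed = body.trim();
        if (trimmed.equalsIgnoreCase("true")) {
            return Optional.of(true);
        }
        if (trimmed.equalsIgnoreCase("false")) {
            return Optional.of(false);
        }
        return Optional.empty();                // empty or malformed body
    }
}
```

This matters because `Boolean.valueOf` returns `false` for any string other than "true", so a garbage body would otherwise be read as a failed pin check. The caller decides what an empty result means.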
Testing with Wiremock 
stubFor(get(urlEqualTo("/dependencyPath"))
    .willReturn(aResponse()
        .withFault(Fault.MALFORMED_RESPONSE_CHUNK)));

{
  "request": {
    "method": "GET",
    "url": "/fault"
  },
  "response": {
    "fault": "RANDOM_DATA_THEN_CLOSE"
  }
}

{
  "request": {
    "method": "GET",
    "url": "/fault"
  },
  "response": {
    "fault": "EMPTY_RESPONSE"
  }
}
4 - Know if it’s your fault 
What to record 
● Metrics: Timings, errors, concurrent incoming requests, thread pool statistics, connection pool statistics
● Logging: Boundary logging, Elasticsearch / Logstash
● Request identifiers
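To make the metrics bullet concrete, here is a JDK-only sketch that wraps a dependency call and records call count, error count and the slowest timing. The talk uses Codahale metrics with Graphite for this. The class below is a hypothetical stand-in, not that library's API.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.LongAdder;
import java.util.function.Supplier;

// Hypothetical recorder: one instance per dependency being called.
final class CallRecorder {
    final LongAdder calls = new LongAdder();
    final LongAdder errors = new LongAdder();
    final AtomicLong maxNanos = new AtomicLong();

    // Runs the call, counting it, timing it, and counting any runtime failure.
    <T> T record(Supplier<T> call) {
        long start = System.nanoTime();
        calls.increment();
        try {
            return call.get();
        } catch (RuntimeException e) {
            errors.increment();
            throw e; // the caller still sees the failure
        } finally {
            long elapsed = System.nanoTime() - start;
            maxNanos.accumulateAndGet(elapsed, Math::max);
        }
    }
}
```

Recording at the boundary of each dependency is what lets you tell whether a slow response was caused by your service or by something it calls.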
Graphite + Codahale 
Response times
Separate resource pools
● Don’t flood your dependencies
● Be able to answer the questions:
○ How many connections will you make to dependency X?
○ Are you getting close to your max connections?
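A per-dependency limit can be sketched with a semaphore: each dependency gets its own fixed number of permits, and a call that finds none free is turned away rather than queued. The class name and API below are hypothetical. Hystrix offers its own thread pool and semaphore isolation for the same purpose.

```java
import java.util.Optional;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

// Hypothetical bulkhead: create one instance per dependency.
final class DependencyBulkhead {
    private final int maxConcurrentCalls;
    private final Semaphore permits;

    DependencyBulkhead(int maxConcurrentCalls) {
        this.maxConcurrentCalls = maxConcurrentCalls;
        this.permits = new Semaphore(maxConcurrentCalls);
    }

    // Runs the call only if a permit is free; returns empty when the limit is reached.
    // The supplied call must not return null.
    <T> Optional<T> tryCall(Supplier<T> call) {
        if (!permits.tryAcquire()) {
            return Optional.empty();
        }
        try {
            return Optional.of(call.get());
        } finally {
            permits.release();
        }
    }

    // Answers "are you getting close to your max connections?"
    int inFlight() {
        return maxConcurrentCalls - permits.availablePermits();
    }
}
```

Because the limit is explicit, both questions on the slide have a direct answer: the maximum is the constructor argument and the current usage is `inFlight()`.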
So easy with Dropwizard + Hystrix 
@Override
public void initialize(Bootstrap<AppConfig> appConfigBootstrap) {
    HystrixCodaHaleMetricsPublisher metricsPublisher
            = new HystrixCodaHaleMetricsPublisher(appConfigBootstrap.getMetricRegistry());
    HystrixPlugins.getInstance().registerMetricsPublisher(metricsPublisher);
}

metrics:
  reporters:
    - type: graphite
      host: 192.168.10.120
      port: 2003
      prefix: shiny_app
5 - Don’t whack a dead horse 
[Diagram: Shiny App instances, User Service, Device Service, Pin Service; request: Play Movie]
What to do.. 
● Yes this will happen.. 
● Mandatory dependency - fail *really* fast 
● Throttling 
● Fallbacks 
Circuit breaker pattern 
Implementation with Hystrix 
@GET
@Timed
public String integrate() {
    LOGGER.info("I best do some integration!");
    String user = new UserServiceDependency(userService).execute();
    String device = new DeviceServiceDependency(deviceService).execute();
    Boolean pinCheck = new PinCheckDependency(pinService).execute();
    return String.format("[User info: %s] \n[Device info: %s] \n[Pin check: %s] \n",
            user, device, pinCheck);
}
Implementation with Hystrix 
public class PinCheckDependency extends HystrixCommand<Boolean> {
    // Constructor and the httpClient field are not shown on the slide.

    @Override
    protected Boolean run() throws Exception {
        HttpGet pinCheck = new HttpGet("http://localhost:9090/pincheck");
        HttpResponse pinCheckResponse = httpClient.execute(pinCheck);
        String pinCheckInfo = EntityUtils.toString(pinCheckResponse.getEntity());
        return Boolean.valueOf(pinCheckInfo);
    }

    // Used when run() fails, times out, is rejected, or the circuit is open.
    @Override
    public Boolean getFallback() {
        return true;
    }
}
Triggering the fallback 
● Error threshold percentage 
● Bucket of time for the percentage 
● Minimum number of requests to trigger 
● Time before trying a request again 
● Disable 
● Per instance statistics 
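How the first four settings interact can be shown with a deliberately simplified breaker. This is a sketch and not Hystrix's implementation: it uses a single fixed statistics bucket instead of rolling buckets, holds a lock for the duration of the call, and every name in it is hypothetical. The clock is injected so the behaviour can be tested without sleeping.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

// Hypothetical, simplified circuit breaker.
final class SimpleCircuitBreaker {
    private final int errorThresholdPercent; // error threshold percentage
    private final int minimumRequests;       // minimum number of requests to trigger
    private final long windowMs;             // bucket of time for the percentage
    private final long sleepWindowMs;        // time before trying a request again
    private final LongSupplier clock;        // current time in milliseconds

    private long windowStart;
    private int requests;
    private int failures;
    private boolean open;
    private long openedAt;

    SimpleCircuitBreaker(int errorThresholdPercent, int minimumRequests,
                         long windowMs, long sleepWindowMs, LongSupplier clock) {
        this.errorThresholdPercent = errorThresholdPercent;
        this.minimumRequests = minimumRequests;
        this.windowMs = windowMs;
        this.sleepWindowMs = sleepWindowMs;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    // synchronized keeps the sketch short; it also serialises calls, which real code should avoid.
    synchronized <T> T call(Supplier<T> primary, Supplier<T> fallback) {
        long now = clock.getAsLong();
        if (open && now - openedAt < sleepWindowMs) {
            return fallback.get();            // open: skip the dependency entirely
        }
        if (now - windowStart >= windowMs) {  // start a fresh statistics bucket
            windowStart = now;
            requests = 0;
            failures = 0;
        }
        boolean trial = open;                 // sleep window elapsed: allow one trial call
        requests++;
        try {
            T result = primary.get();
            if (trial) {                      // trial succeeded: close and reset
                open = false;
                requests = 0;
                failures = 0;
            }
            return result;
        } catch (RuntimeException e) {
            failures++;
            boolean overThreshold = requests >= minimumRequests
                    && failures * 100 >= errorThresholdPercent * requests;
            if (trial || overThreshold) {
                open = true;
                openedAt = now;
            }
            return fallback.get();
        }
    }

    synchronized boolean isOpen() {
        return open;
    }
}
```

The minimum request count stops one early failure from opening the circuit, and the sleep window decides how long the dependency is left alone before a single trial call is allowed through.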
6 - Turn off broken stuff 
● The kill switch 
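A kill switch can be as small as a flag checked before the risky call. A sketch with a hypothetical class. How the flag gets flipped in production (config reload, admin endpoint, and so on) is left open here.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

// Hypothetical kill switch guarding a single feature or dependency.
final class KillSwitch {
    private final AtomicBoolean enabled = new AtomicBoolean(true);

    void turnOff() {
        enabled.set(false);
    }

    void turnOn() {
        enabled.set(true);
    }

    // Runs the feature only while the switch is on; otherwise returns the given default
    // without touching the broken component at all.
    <T> T callIfEnabled(Supplier<T> feature, T whenDisabled) {
        return enabled.get() ? feature.get() : whenDisabled;
    }
}
```

Unlike a circuit breaker, this is a manual decision: an operator turns the feature off and it stays off until someone turns it back on.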
To recap 
1. Don’t take forever - Timeouts 
2. Don’t try if you can’t succeed 
3. Fail gracefully 
4. Know if it’s your fault 
5. Don’t whack a dead horse 
6. Turn broken stuff off 
Links 
● Examples: 
○ https://github.com/chbatey/spring-cloud-example 
○ https://github.com/chbatey/dropwizard-hystrix 
○ https://github.com/chbatey/vagrant-wiremock-saboteur 
● Tech: 
○ https://github.com/Netflix/Hystrix 
○ https://www.vagrantup.com/ 
○ http://wiremock.org/ 
○ https://github.com/tomakehurst/saboteur
Questions? 
● Thanks for listening! 
● http://christopher-batey.blogspot.co.uk/ 
Developer takeaways 
● Learn about TCP 
● Love vagrant, docker etc to enable testing 
● Don’t trust libraries 
Hystrix cost - do this yourself 
Hystrix metrics 
● Failure count 
● Percentiles from Hystrix point of view
● Error percentages
How to test metric publishing? 
● Stub out Graphite and verify calls?
● Programmatically call Graphite and verify numbers?
● Make metrics + logs part of the story demo

Streamlining Python Development: A Guide to a Modern Project SetupFlorian Wilhelm
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Mark Simos
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024Lonnie McRorey
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenHervé Boutemy
 

Dernier (20)

DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024TeamStation AI System Report LATAM IT Salaries 2024
TeamStation AI System Report LATAM IT Salaries 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
DevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache MavenDevoxxFR 2024 Reproducible Builds with Apache Maven
DevoxxFR 2024 Reproducible Builds with Apache Maven
 

Fault tolerant microservices (BSkyB)

  • 2. Who is this guy? (@chbatey) ● Enthusiastic nerd ● Senior software engineer at BSkyB ● Builds a lot of distributed applications ● Apache Cassandra MVP
  • 3. Agenda 1. Setting the scene ○ What do we mean by a fault? ○ What is a microservice? ○ Monolith application vs the micro(ish) service 2. A worked example ○ Identify an issue ○ Reproduce/test it ○ Show how to deal with the issue
  • 4. So… what do applications look like?
  • 5. So... what do systems look like now?
  • 6. But different things go wrong... [Diagram labels: down, slow network, slow app, 2 second max GC :(, missing packets]
  • 7. Fault tolerance 1. Don’t take forever - Timeouts 2. Don’t try if you can’t succeed 3. Fail gracefully 4. Know if it’s your fault 5. Don’t whack a dead horse 6. Turn broken stuff off
  • 8. Time for an example... ● All examples are on GitHub ● Technologies used: ○ Dropwizard ○ Spring Boot ○ Wiremock ○ Hystrix ○ Graphite ○ Saboteur
  • 9. Example: Movie player service [Diagram labels: Shiny App (several instances), User Service, Device Service, Pin Service, Play Movie]
  • 10. Testing microservices: You don’t know a service is fault tolerant if you don’t test faults
  • 11. Isolated service tests [Diagram labels: Acceptance Test, Play Movie, Prime, Shiny App, Mocks (User, Device, Pin service)]
  • 12. 1 - Don’t take forever ● If at first you don’t succeed, don’t take forever to tell someone ● Timeout and fail fast
  • 13. Which timeouts? ● Socket connection timeout ● Socket read timeout
  • 14. Your service hung for 30 seconds :( [Diagram labels: Customer, You :(]
  • 15. Which timeouts? ● Socket connection timeout ● Socket read timeout ● Resource acquisition
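
The two socket-level timeouts listed above can be sketched with the JDK's own `HttpURLConnection`. This is a minimal illustration, not code from the talk's repositories: the class name, the helper names, and the timeout values are assumptions. The read timeout limits each blocking read rather than the whole call, so a dependency that keeps trickling bytes can still hold a request far longer than the configured value.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.ServerSocket;
import java.net.SocketTimeoutException;
import java.net.URI;

class SocketTimeoutSketch {

    /** Returns the HTTP status code, or -1 if connecting or reading timed out. */
    static int fetchStatus(String url, int connectTimeoutMs, int readTimeoutMs) {
        try {
            HttpURLConnection conn =
                    (HttpURLConnection) URI.create(url).toURL().openConnection();
            conn.setConnectTimeout(connectTimeoutMs); // limit on establishing the TCP connection
            conn.setReadTimeout(readTimeoutMs);       // limit on each blocking read, not on the whole call
            try {
                return conn.getResponseCode();
            } catch (SocketTimeoutException e) {
                return -1;
            } finally {
                conn.disconnect();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Hypothetical demo: a local socket that accepts connections but never answers. */
    static int statusFromSilentServer() {
        try (ServerSocket silent = new ServerSocket(0)) {
            return fetchStatus("http://127.0.0.1:" + silent.getLocalPort() + "/", 500, 200);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(statusFromSilentServer());
    }
}
```

Neither setting covers time spent waiting for a pooled connection or a free worker thread, which is the "resource acquisition" item on the list.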
  • 16. Your service hung for 10 minutes :(
  • 17. Let’s think about this
  • 18. A little more detail
  • 19. Wiremock + Saboteur + Vagrant ● Vagrant - launches + provisions local VMs ● Saboteur - uses tc, iptables to simulate network issues ● Wiremock - used to mock HTTP dependencies ● Cucumber - acceptance tests
  • 20. I can write an automated test for that? [Diagram labels: Vagrant + VirtualBox VM, Wiremock (User Service, Device Service, Pin Service), Saboteur, Play Movie Service, Acceptance Test, prime to drop traffic, reset]
  • 21. Implementing reliable timeouts ● Homemade: Worker Queue + Thread pool (executor)
  • 22. Implementing reliable timeouts ● Homemade: Worker Queue + Thread pool (executor) ● Hystrix
  • 23. Implementing reliable timeouts ● Homemade: Worker Queue + Thread pool (executor) ● Hystrix ● Spring Cloud Netflix
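
The "homemade" option above might look roughly like the following sketch: the caller hands the work to a bounded thread pool and waits on the `Future` for a fixed time, so the wait is limited no matter what the socket does. The class name, pool sizes, and timeout values are assumptions chosen for illustration.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Future;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class HomemadeTimeout {

    /** Runs the task on the pool and gives up after timeoutMs, returning the fallback. */
    static <T> T callWithTimeout(ExecutorService pool, Callable<T> task, long timeoutMs, T fallback) {
        Future<T> future = pool.submit(task);
        try {
            return future.get(timeoutMs, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so it does not keep running
            return fallback;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            future.cancel(true);
            return fallback;
        } catch (ExecutionException e) {
            return fallback; // the task itself failed
        }
    }

    static String demo() {
        // Two worker threads and a bounded queue of ten waiting tasks.
        ExecutorService pool = new ThreadPoolExecutor(
                2, 2, 0L, TimeUnit.MILLISECONDS, new ArrayBlockingQueue<>(10));
        try {
            String fast = callWithTimeout(pool, () -> "fast", 1_000, "timed out");
            String slow = callWithTimeout(pool, () -> {
                Thread.sleep(10_000); // stands in for a hung dependency
                return "slow";
            }, 100, "timed out");
            return fast + " / " + slow;
        } finally {
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With a full queue, `submit` throws `RejectedExecutionException`; a real implementation would have to decide how to report that to the caller.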
  • 24. A simple Spring RestController
      @RestController
      public class Resource {
          private static final Logger LOGGER = LoggerFactory.getLogger(Resource.class);

          @Autowired
          private ScaryDependency scaryDependency;

          @RequestMapping("/scary")
          public String callTheScaryDependency() {
              LOGGER.info("RestContoller: I wonder which thread I am on!");
              return scaryDependency.getScaryString();
          }
      }
  • 25. Scary dependency
      @Component
      public class ScaryDependency {
          private static final Logger LOGGER = LoggerFactory.getLogger(ScaryDependency.class);

          public String getScaryString() {
              LOGGER.info("Scary dependency: I wonder which thread I am on!");
              if (System.currentTimeMillis() % 2 == 0) {
                  return "Scary String";
              } else {
                  try {
                      Thread.sleep(10000); // simulate a dependency that hangs for 10 seconds
                  } catch (InterruptedException e) {
                      Thread.currentThread().interrupt();
                      throw new IllegalStateException("Interrupted while sleeping", e);
                  }
                  return "Really slow scary string";
              }
          }
      }
  • 26. All on the tomcat thread
      13:07:32.814 [http-nio-8080-exec-1] INFO info.batey.examples.Resource - RestContoller: I wonder which thread I am on!
      13:07:32.896 [http-nio-8080-exec-1] INFO info.batey.examples.ScaryDependency - Scary dependency: I wonder which thread I am on!
  • 27. Seriously this simple now?
      @Component
      public class ScaryDependency {
          private static final Logger LOGGER = LoggerFactory.getLogger(ScaryDependency.class);

          @HystrixCommand
          public String getScaryString() {
              LOGGER.info("Scary dependency: I wonder which thread I am on!");
              if (System.currentTimeMillis() % 2 == 0) {
                  return "Scary String";
              } else {
                  try {
                      Thread.sleep(10000); // simulate a dependency that hangs for 10 seconds
                  } catch (InterruptedException e) {
                      Thread.currentThread().interrupt();
                      throw new IllegalStateException("Interrupted while sleeping", e);
                  }
                  return "Really slow scary string";
              }
          }
      }
  • 28. What an annotation can do...
      13:07:32.814 [http-nio-8080-exec-1] INFO info.batey.examples.Resource - RestController: I wonder which thread I am on!
      13:07:32.896 [hystrix-ScaryDependency-1] INFO info.batey.examples.ScaryDependency - Scary Dependency: I wonder which thread I am on!
  • 29. Timeouts take home ● You can’t use network level timeouts for SLAs ● Test your SLAs - if someone says you can’t, hit them with a stick ● Scary things happen without network issues
  • 30. 2 - Don’t try if you can’t succeed
  • 31. Complexity ● When an application grows in complexity it will eventually start sending emails
  • 32. Complexity ● When an application grows in complexity it will eventually contain queues and thread pools (rather than "start sending emails")
  • 33. Don’t try if you can’t succeed ● Executors factory methods are unbounded :( ○ newFixedThreadPool (unbounded queue) ○ newSingleThreadExecutor (unbounded queue) ○ newCachedThreadPool (unbounded number of threads) ● Bound your queues and threads ● Fail quickly when the queue / maxPoolSize is met ● Know your drivers
  • 34. This is a functional requirement ● Set the timeout very high ● Use Wiremock to add a large delay to the requests ● Set queue size and thread pool size to 1 ● Send in 2 requests to use the thread and fill the queue ● What happens on the 3rd request?
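
The experiment described above can be reproduced in miniature with a plain `ThreadPoolExecutor`. This is a sketch under assumptions: the class name is hypothetical, and a latch stands in for the Wiremock delay. One worker thread, a queue of one, and the default abort policy mean that the third submission is rejected immediately instead of waiting.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class BoundedPoolSketch {

    /** Returns true if the third request is rejected rather than queued. */
    static boolean thirdRequestRejected() {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                1, 1, 0L, TimeUnit.MILLISECONDS,
                new ArrayBlockingQueue<>(1),            // bounded queue of size 1
                new ThreadPoolExecutor.AbortPolicy());  // throw when thread and queue are both busy
        CountDownLatch release = new CountDownLatch(1);
        Runnable slowRequest = () -> {
            try {
                release.await(); // stands in for a very slow dependency
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        };
        try {
            pool.execute(slowRequest); // 1st request occupies the only thread
            pool.execute(slowRequest); // 2nd request fills the queue
            try {
                pool.execute(slowRequest); // 3rd request has nowhere to go
                return false;
            } catch (RejectedExecutionException e) {
                return true; // failed fast
            }
        } finally {
            release.countDown();
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("third request rejected: " + thirdRequestRejected());
    }
}
```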
  • 35. 3 - Fail gracefully
  • 36. Expect rubbish ● Expect invalid HTTP ● Expect malformed response bodies ● Expect connection failures ● Expect huge / tiny responses
  • 37. Testing with Wiremock
      stubFor(get(urlEqualTo("/dependencyPath"))
          .willReturn(aResponse()
              .withFault(Fault.MALFORMED_RESPONSE_CHUNK)));

      {
        "request": { "method": "GET", "url": "/fault" },
        "response": { "fault": "RANDOM_DATA_THEN_CLOSE" }
      }

      {
        "request": { "method": "GET", "url": "/fault" },
        "response": { "fault": "EMPTY_RESPONSE" }
      }
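
On the consuming side, "expect rubbish" could translate into strict parsing that reports an unusable body instead of guessing. The sketch below is an assumption-laden illustration: the class name and the size limit are hypothetical, and it assumes a pin check endpoint that answers with a bare `true` or `false`. `Boolean.valueOf(String)` on its own maps every unrecognised body, including an HTML error page, to `false`.

```java
import java.util.Optional;

class PinCheckParser {

    // Assumed upper bound for a body that should only contain "true" or "false".
    private static final int MAX_BODY_CHARS = 16;

    /** Returns the parsed answer, or empty if the body is missing, oversized, or malformed. */
    static Optional<Boolean> parse(String body) {
        if (body == null || body.length() > MAX_BODY_CHARS) {
            return Optional.empty();
        }
        String trimmed = body.trim();
        if (trimmed.equalsIgnoreCase("true")) {
            return Optional.of(true);
        }
        if (trimmed.equalsIgnoreCase("false")) {
            return Optional.of(false);
        }
        return Optional.empty(); // anything else is treated as a failed call
    }

    public static void main(String[] args) {
        System.out.println(parse("true"));
        System.out.println(parse("<html>502 Bad Gateway</html>"));
    }
}
```

The caller can then treat an empty result like any other dependency failure and apply its fallback.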
  • 38. 4 - Know if it’s your fault
  • 39. What to record ● Metrics: Timings, errors, concurrent incoming requests, thread pool statistics, connection pool statistics ● Logging: Boundary logging, elasticsearch / logstash ● Request identifiers
  • 42. Separate resource pools ● Don’t flood your dependencies ● Be able to answer the questions: ○ How many connections will you make to dependency X? ○ Are you getting close to your max connections?
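
One way to give each dependency its own limit, and to be able to answer both questions above, is a semaphore per dependency. This is a minimal sketch rather than the approach used in the talk's examples (which rely on Hystrix); the class and method names are hypothetical.

```java
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

class DependencyLimiter {
    private final Semaphore permits;
    private final int maxConcurrent;

    DependencyLimiter(int maxConcurrent) {
        this.maxConcurrent = maxConcurrent;
        this.permits = new Semaphore(maxConcurrent);
    }

    /** Calls the dependency if a permit is free, otherwise returns the fallback at once. */
    <T> T call(Supplier<T> dependencyCall, T fallback) {
        if (!permits.tryAcquire()) {
            return fallback; // at the limit: do not queue up behind a struggling dependency
        }
        try {
            return dependencyCall.get();
        } finally {
            permits.release();
        }
    }

    /** "How many connections will you make to dependency X?" */
    int maxConcurrent() {
        return maxConcurrent;
    }

    /** "Are you getting close to your max connections?" */
    int inFlight() {
        return maxConcurrent - permits.availablePermits();
    }

    public static void main(String[] args) {
        DependencyLimiter pinService = new DependencyLimiter(1);
        // The nested call arrives while the only permit is taken, so it gets the fallback.
        String result = pinService.call(
                () -> "outer:" + pinService.call(() -> "inner", "rejected"), "rejected");
        System.out.println(result);
    }
}
```

Publishing `inFlight()` as a gauge would make the second question visible on a dashboard.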
  • 43. So easy with Dropwizard + Hystrix
      @Override
      public void initialize(Bootstrap<AppConfig> appConfigBootstrap) {
          HystrixCodaHaleMetricsPublisher metricsPublisher =
              new HystrixCodaHaleMetricsPublisher(appConfigBootstrap.getMetricRegistry());
          HystrixPlugins.getInstance().registerMetricsPublisher(metricsPublisher);
      }

      metrics:
        reporters:
          - type: graphite
            host: 192.168.10.120
            port: 2003
            prefix: shiny_app
  • 44. 5 - Don’t whack a dead horse [Diagram labels: Shiny App (several instances), User Service, Device Service, Pin Service, Play Movie]
  • 45. What to do.. ● Yes this will happen.. ● Mandatory dependency - fail *really* fast ● Throttling ● Fallbacks
  • 47. Implementation with Hystrix
      @GET
      @Timed
      public String integrate() {
          LOGGER.info("I best do some integration!");
          String user = new UserServiceDependency(userService).execute();
          String device = new DeviceServiceDependency(deviceService).execute();
          Boolean pinCheck = new PinCheckDependency(pinService).execute();
          return String.format("[User info: %s]\n[Device info: %s]\n[Pin check: %s]\n", user, device, pinCheck);
      }
  • 48. Implementation with Hystrix
      public class PinCheckDependency extends HystrixCommand<Boolean> {
          // Constructor and httpClient field are not shown on the slide.

          @Override
          protected Boolean run() throws Exception {
              HttpGet pinCheck = new HttpGet("http://localhost:9090/pincheck");
              HttpResponse pinCheckResponse = httpClient.execute(pinCheck);
              String pinCheckInfo = EntityUtils.toString(pinCheckResponse.getEntity());
              return Boolean.valueOf(pinCheckInfo);
          }
      }
  • 49. Implementation with Hystrix
      public class PinCheckDependency extends HystrixCommand<Boolean> {
          // Constructor and httpClient field are not shown on the slide.

          @Override
          protected Boolean run() throws Exception {
              HttpGet pinCheck = new HttpGet("http://localhost:9090/pincheck");
              HttpResponse pinCheckResponse = httpClient.execute(pinCheck);
              String pinCheckInfo = EntityUtils.toString(pinCheckResponse.getEntity());
              return Boolean.valueOf(pinCheckInfo);
          }

          // Used when run() fails, times out, or the circuit is open.
          @Override
          public Boolean getFallback() {
              return true;
          }
      }
  • 50. Triggering the fallback ● Error threshold percentage ● Bucket of time for the percentage ● Minimum number of requests to trigger ● Time before trying a request again ● Disable ● Per instance statistics
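
The first four settings above are the inputs to a circuit breaker. The following is a simplified sketch of how they interact, not Hystrix's implementation: it counts in one fixed window instead of rolling buckets, may let more than one trial request through under concurrency, and every name in it is hypothetical. The clock is injected so the behaviour can be exercised without sleeping.

```java
import java.util.function.LongSupplier;
import java.util.function.Supplier;

class CircuitBreakerSketch {
    private final int errorThresholdPercent; // error threshold percentage
    private final int minRequests;           // minimum number of requests to trigger
    private final long windowMs;             // bucket of time for the percentage
    private final long sleepWindowMs;        // time before trying a request again
    private final LongSupplier clock;

    private long windowStart;
    private int requests;
    private int failures;
    private long openedAt = -1; // -1 means the circuit is closed

    CircuitBreakerSketch(int errorThresholdPercent, int minRequests,
                         long windowMs, long sleepWindowMs, LongSupplier clock) {
        this.errorThresholdPercent = errorThresholdPercent;
        this.minRequests = minRequests;
        this.windowMs = windowMs;
        this.sleepWindowMs = sleepWindowMs;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    <T> T call(Supplier<T> action, T fallback) {
        if (!allowRequest()) {
            return fallback; // circuit open: do not touch the dependency at all
        }
        try {
            T result = action.get();
            record(true);
            return result;
        } catch (RuntimeException e) {
            record(false);
            return fallback;
        }
    }

    synchronized boolean isOpen() {
        return openedAt >= 0;
    }

    private synchronized boolean allowRequest() {
        // Closed, or open long enough that a trial request is allowed through.
        return openedAt < 0 || clock.getAsLong() - openedAt >= sleepWindowMs;
    }

    private synchronized void record(boolean success) {
        long now = clock.getAsLong();
        if (openedAt >= 0) {
            if (success) {           // trial request worked: close the circuit
                openedAt = -1;
                resetWindow(now);
            } else {                 // trial request failed: restart the sleep window
                openedAt = now;
            }
            return;
        }
        if (now - windowStart >= windowMs) {
            resetWindow(now);
        }
        requests++;
        if (!success) {
            failures++;
        }
        if (requests >= minRequests && failures * 100 >= errorThresholdPercent * requests) {
            openedAt = now;
        }
    }

    private void resetWindow(long now) {
        windowStart = now;
        requests = 0;
        failures = 0;
    }
}
```

A caller might construct it as `new CircuitBreakerSketch(50, 20, 10_000, 5_000, System::currentTimeMillis)`; those particular values are illustrative only.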
  • 51. 6 - Turn off broken stuff ● The kill switch
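
At its simplest, a kill switch is a flag that operators can flip at runtime and that is checked before the feature runs. The sketch below assumes a hypothetical `KillSwitch` class; how the flag gets flipped (admin endpoint, configuration reload, and so on) is left open.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Supplier;

class KillSwitch {
    private final AtomicBoolean enabled = new AtomicBoolean(true);

    void turnOff() {
        enabled.set(false);
    }

    void turnOn() {
        enabled.set(true);
    }

    /** Runs the feature only while the switch is on, otherwise returns the fallback. */
    <T> T call(Supplier<T> feature, T fallback) {
        return enabled.get() ? feature.get() : fallback;
    }

    public static void main(String[] args) {
        KillSwitch pinCheck = new KillSwitch();
        System.out.println(pinCheck.call(() -> "pin checked", "pin check skipped"));
        pinCheck.turnOff();
        System.out.println(pinCheck.call(() -> "pin checked", "pin check skipped"));
    }
}
```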
  • 52. To recap 1. Don’t take forever - Timeouts 2. Don’t try if you can’t succeed 3. Fail gracefully 4. Know if it’s your fault 5. Don’t whack a dead horse 6. Turn broken stuff off
  • 53. Links ● Examples: ○ https://github.com/chbatey/spring-cloud-example ○ https://github.com/chbatey/dropwizard-hystrix ○ https://github.com/chbatey/vagrant-wiremock-saboteur ● Tech: ○ https://github.com/Netflix/Hystrix ○ https://www.vagrantup.com/ ○ http://wiremock.org/ ○ https://github.com/tomakehurst/saboteur
  • 54. Questions? ● Thanks for listening! ● http://christopher-batey.blogspot.co.uk/ ● @chbatey
  • 55. Developer takeaways ● Learn about TCP ● Love vagrant, docker etc to enable testing ● Don’t trust libraries
  • 56. Hystrix cost - do this yourself
  • 57. Hystrix metrics ● Failure count ● Percentiles from Hystrix point of view ● Error percentages
  • 58. How to test metric publishing? ● Stub out graphite and verify calls? ● Programmatically call graphite and verify numbers? ● Make metrics + logs part of the story demo