2. Highlights
Average latency
0.146973 milliseconds per message (146,973.09 nanoseconds) with no JVM tuning
over 100,000 price samples (~6 million (6 x 10^6) messages / minute)
What are timings based upon?
Receive a byte array representing a price, deserialize it, validate, cache, apply 100 internal
price rungs on 1 ladder, apply 100 external price rungs on 3 external margin groups, and
serialize the results for delivery to an order entry system.
Machine: (PEPOC2 in BUD)
Dell PowerEdge R200
Intel(R) Core(TM)2 Duo CPU E4500 @ 2.20GHz
4GB RAM
Complexity
Lines of code: ~1400
XS Structural Complexity rating (Structure 101): 0
3. Core Processing Logic Principles
Price Calculations – design of price structure
Price objects are built as they move through the system
Prices should not contain more information than required at any given
point
Minimize functionality and focus on data in prices
Price objects get larger (more complex) as they contain more information
but are eventually condensed into a quote
The relationship between prices should be easy to understand and simple
to implement
Tangled business logic can be implemented in processing logic rather than
in price
Fast serialization/de-serialization of objects is mandatory
We used Google Protocol Buffers for speed but could equally use JMS or
another transport layer
4. Architecture Performance Principles
First rule of distributed applications: Don’t
Collocate as much as possible
Performance
Avoid network overhead
Avoid serialization (time consuming, excessive object creation)
Simplification
Code, configuration, monitoring, and management
Use lock-free CAS operations where possible
Avoid contention
Separate mutable and immutable data
Share immutable objects
Reduce unnecessary object creation
Limit system jitter and latency from GC cycles
Price objects are only updated by a single thread
No synchronization, less complexity
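A minimal sketch of how an immutable price snapshot might be shared between threads with a lock-free compare-and-set, assuming a monotonically increasing sequence number per update. The names PriceSnapshot and LatestPrice, and the micro-unit long fields, are hypothetical and are not taken from the Price Engine itself.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical immutable snapshot: all fields final, so it is safe to share.
final class PriceSnapshot {
    final String symbol;
    final long bidMicros;  // price in micro-units, held as a primitive
    final long askMicros;
    final long sequence;   // assumed to increase with every update

    PriceSnapshot(String symbol, long bidMicros, long askMicros, long sequence) {
        this.symbol = symbol;
        this.bidMicros = bidMicros;
        this.askMicros = askMicros;
        this.sequence = sequence;
    }
}

// Hypothetical holder: writers publish with CAS, readers never block.
final class LatestPrice {
    private final AtomicReference<PriceSnapshot> latest = new AtomicReference<>();

    // Publishes the candidate only if it is newer than the visible snapshot.
    boolean publish(PriceSnapshot candidate) {
        while (true) {
            PriceSnapshot current = latest.get();
            if (current != null && current.sequence >= candidate.sequence) {
                return false; // stale update, drop it
            }
            if (latest.compareAndSet(current, candidate)) {
                return true;
            }
            // Another thread published in between; re-read and retry.
        }
    }

    PriceSnapshot read() {
        return latest.get();
    }
}
```

Because a snapshot is never mutated after construction, readers can hold on to the reference returned by read() without copying it or taking a lock.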
5. Architecture Modularity
Goals
Enforce separation of concerns between subsystems
All subsystems communicate via an API
Subsystem implementation classes and artifacts are not visible outside of their
module
Ability to replace subsystems
Extraction, internal pricing, external pricing, distribution
Architected extension points
Defined SPI for extending the Price Engine
Allow customers to modify default behaviors
Transport independence – use JMS, ZeroMQ, other transport layer, without code
changes
No significant runtime performance impact
Small startup impact for classloading due to OSGi filters
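One way an extension point of this kind could look, sketched with plain Java interfaces rather than OSGi services. MarginPolicy, FixedStepMargin and ExternalPricing are hypothetical names chosen for illustration; the actual SPI of the Price Engine is not shown in these slides.

```java
// Hypothetical SPI: the only type a customer extension needs to see.
interface MarginPolicy {
    // Returns the price for the given rung (0-based), in micro-units.
    long apply(long midMicros, int rung);
}

// Hypothetical default behaviour shipped with the engine.
final class FixedStepMargin implements MarginPolicy {
    private final long stepMicros;

    FixedStepMargin(long stepMicros) {
        this.stepMicros = stepMicros;
    }

    @Override
    public long apply(long midMicros, int rung) {
        return midMicros + stepMicros * (rung + 1);
    }
}

// Hypothetical subsystem: depends on the SPI, never on an implementation class.
final class ExternalPricing {
    private final MarginPolicy policy;

    ExternalPricing(MarginPolicy policy) {
        this.policy = policy;
    }

    // Builds one price per rung using whichever policy was supplied.
    long[] ladder(long midMicros, int rungs) {
        long[] out = new long[rungs];
        for (int i = 0; i < rungs; i++) {
            out[i] = policy.apply(midMicros, i);
        }
        return out;
    }
}
```

A customer could replace the default by passing a different MarginPolicy, for example a lambda, to ExternalPricing without touching the subsystem's code.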
7. Data Model
Data Model predicated on simplicity, flexibility and matching
business flow
Object hierarchy simplified into manageable hierarchy that is
still flexible
Adheres to the following principles
Only carry the data you need
Use primitive types when available
Keep objects as dumb containers for data as much as possible
Minimize network object size
E.g. incoming prices are 55 bytes
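The principles above can be illustrated with a small data-only class using primitive fields and a fixed-size binary encoding. This is a sketch under stated assumptions: RawPrice and its field layout are hypothetical, the encoding uses java.nio.ByteBuffer rather than Protocol Buffers, and the 30-byte size belongs to this sketch only, not to the 55-byte incoming price mentioned above.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical incoming price: primitive fields, no business logic.
final class RawPrice {
    static final int SYMBOL_BYTES = 6;                         // e.g. "EURUSD"
    static final int ENCODED_BYTES = SYMBOL_BYTES + 8 + 8 + 8; // 30 bytes here

    final String symbol;
    final long bidMicros;      // price in micro-units, no BigDecimal
    final long askMicros;
    final long timestampNanos;

    RawPrice(String symbol, long bidMicros, long askMicros, long timestampNanos) {
        this.symbol = symbol;
        this.bidMicros = bidMicros;
        this.askMicros = askMicros;
        this.timestampNanos = timestampNanos;
    }

    // Writes the fields in a fixed order into a fixed-size array.
    byte[] encode() {
        byte[] sym = symbol.getBytes(StandardCharsets.US_ASCII);
        if (sym.length != SYMBOL_BYTES) {
            throw new IllegalArgumentException("symbol must be 6 ASCII characters");
        }
        return ByteBuffer.allocate(ENCODED_BYTES)
                .put(sym)
                .putLong(bidMicros)
                .putLong(askMicros)
                .putLong(timestampNanos)
                .array();
    }

    // Reads the fields back in the same order they were written.
    static RawPrice decode(byte[] bytes) {
        ByteBuffer buf = ByteBuffer.wrap(bytes);
        byte[] sym = new byte[SYMBOL_BYTES];
        buf.get(sym);
        long bid = buf.getLong();
        long ask = buf.getLong();
        long ts = buf.getLong();
        return new RawPrice(new String(sym, StandardCharsets.US_ASCII), bid, ask, ts);
    }
}
```

Keeping the class to plain fields makes the wire size easy to reason about: it is the sum of the field widths, with no per-field framing.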
9. Flows
System architecture built on a series of Flows that contain 1..N
Processors which operate on a Price in a given order
A module contains one or more flows
The terminating Processor in a Flow calls the next Flow
Flows are synchronous
Avoid context switching, threading overhead, and need to synchronize
price objects
Only one thread modifies a price object
Simplify programming
Asynchronous programming more complex
System is multi-threaded
A flow may be running on multiple threads
Can spin off async tasks and deliver intermediate data as necessary
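The flow model described above might be sketched as follows. Price, Processor and Flow are hypothetical stand-ins for the real types, and the trail list exists only to make the execution order visible; the actual interfaces are not shown in these slides.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical mutable price; only the thread running the flow touches it.
final class Price {
    long mid;
    final List<String> trail = new ArrayList<>(); // records which flows ran

    Price(long mid) {
        this.mid = mid;
    }
}

// Hypothetical processor contract: one step applied to a price.
interface Processor {
    void process(Price price);
}

// Hypothetical flow: runs its processors in order on the calling thread.
final class Flow {
    private final String name;
    private final List<Processor> processors = new ArrayList<>();

    Flow(String name) {
        this.name = name;
    }

    Flow add(Processor processor) {
        processors.add(processor);
        return this;
    }

    // Appends a terminating processor that hands the price to the next flow.
    Flow then(Flow next) {
        return add(next::run);
    }

    // Synchronous: returns only after every processor, and any chained
    // flow, has finished with the price.
    void run(Price price) {
        price.trail.add(name);
        for (Processor processor : processors) {
            processor.process(price);
        }
    }
}
```

For example, a "pricing" flow that doubles mid and is chained with then(...) to a "distribution" flow that adds 1 turns a Price with mid 10 into mid 21, all on the caller's thread. Several threads can call run on the same Flow concurrently as long as each passes its own Price and the flows are fully assembled before use.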
11. Productivity
Designed as a Maven multi-module project
● Zero-configuration setup: git clone… mvn clean install
Automated integration testing
● Full end-to-end in-container system testing as part of local
Maven build
● Can be run as part of a CI build
Integrated Caliper microbenchmark framework
● http://code.google.com/p/caliper/