Balancing Replication and Partitioning in a Distributed Java Database

In-memory databases offer
significant gains in But three issues
performance as all data is have stunted their
freely available. There is no uptake: Address
Traditional disk- spaces only being
need to page to and from disk.
oriented database large enough for a
architecture is subset of a typical
showing its age. This makes joins a user’s data. The ‘one
problem. When data more bit’ problem
must be joined across and durability.
multiple machines
performance degradation
Snowflake is inevitable. Distributed in-
Schemas allow memory databases
us to mix But this model only goes solve these three
Partitioning so far. “Connected problems but at the
and Replication Replication” takes us a price of loosing the
so joins never step further allowing us single address space.
hit the wire. to make the best possible
use of replication.

The lay of the land:
The main architectural
constructs in the database
industry

ms
μs
ns
ps

1MB Disk/Network
1MB Main Memory

0.000,000,000,000
Cross Continental Main Memory
L1 Cache Ref
Round Trip
Ref
Cross Network L2 Cache Ref
Round Trip
* L1 ref is about 2 clock cycles or 0.7ns. This is
the time it takes light to travel 20cm

Taken from “OLTP Through
the Looking Glass, and What
We Found There”
Harizopoulos et al

Shared
Nothing

Teradata, Vertica,
Greenplumb…

SN
Regular In-Memory
In-Memory
Database
Database
Drop Disk
Exasol, VoltDB,
Oracle, Sybase,
MySql
Times Ten, HSQL,
KDB
Distribute
Hana

ODC
Distributed
Caching
Coherence, Gemﬁre,
Gigaspaces

Distributed Architecture

Simplify the Contract.

Stick to RAM

450 processes
2TB of RAM
Oracle
Coherence

Messaging (Topic Based) as a system of record
(persistence)

Access Layer Java Java
client
client

API
API
Query Layer

Transactions
Data Layer

Mtms

Cashflows
Persistence Layer

Indexing

Partitioning
Replication

But your storage is limited by
the memory on a node

Keys Fs-Fz
Keys Xa-Yd

Scalable storage, bandwidth
and processing

Trader
Party
Version 1
Trade
Trader
Party
Version 2
Trade
Trader
Party
Version 3
Trade
Trader
Party
Version 4
Trade
…and you need
versioning to do MVCC

Trade
Trader
Party

Party
Trader
Trade

Party
Trader

Trade
Party

So better to use
partitioning, spreading
data around the
cluster.

Trader
Party

Trade

Trade
Trader
Party

!
This is what using Snowﬂake Schemas and
the Connected Replication pattern is all
about!

Crosscutting
Keys

Common
Keys

Replicated
Trader
Party

Trade
Partitioned

Valuation Legs

Valuations

art Transaction Mapping

Cashflow Mapping Facts:
Party Alias

Transaction
=>Big,
Cashflows common
Legs

Parties
keys
Ledger Book

Source Book
Dimensions
Cost Centre

Product =>Small,
Risk Organisation Unit

Business Unit
crosscutting
HCS Entity Keys
Set of Books

0 37,500,000 75,000,000 112,500,000 150,000,000

Coherence’s
KeyAssociation
gives us this
Trades
MTMs

Common
Key

Replicated
Trader
Party

Trade
Partitioned
(

Query Layer
Trader
Party

Trade

Transactions

Data Layer
Mtms

Cashflows

Fact Storage
(Partitioned)

Dimensions
(repliacte)

Transactions

Mtms
Facts
Cashflows
(distribute/
partition)
Fact Storage
(Partitioned)

Valuation Legs

Valuations

Facts:
art Transaction Mapping

Cashflow Mapping

Party Alias
=>Big
=>Distribute
Transaction

Cashflows

Legs

Parties

Ledger Book

Source Book Dimensions
=>Small
Cost Centre

Product

=> Replicate
Business Unit

HCS Entity

Set of Books

0 37,500,000 75,000,000 112,500,000 150,000,000

We use a variant on a
Snowﬂake Schema to
partition big stuff, that has
the same key and replicate
small stuff that has
crosscutting keys.

Select Transaction, MTM, ReferenceData From
MTM, Transaction, Ref Where Cost Centre = ‘CC1’


LBs[]=getLedgerBooksFor(CC1)
SBs[]=getSourceBooksFor(LBs[])
So we have all the bottom level
dimensions needed to query facts

Transactions

Mtms

Cashflows

Partitioned


LBs[]=getLedgerBooksFor(CC1)
SBs[]=getSourceBooksFor(LBs[])

Transactions

Get all Transactions and
Mtms
MTMs (cluster side join) for
the passed Source Books
Cashflows

Partitioned


Populate raw facts LBs[]=getLedgerBooksFor(CC1)
(Transactions) with SBs[]=getSourceBooksFor(LBs[])
dimension data
before returning to
client.

Transactions

Get all Transactions and
Mtms
MTMs (cluster side join) for
the passed Source Books
Cashflows

Partitioned

Replicated Partitioned
Java
client

Dimensions
Facts
API

We never have to do a distributed join!

So all the big stuff is
held paritioned

And we can join
without shipping keys
around and having
intermediate results

Trader
Party
Version 1
Trade
Trader
Party
Version 2
Trade
Trader
Party
Version 3
Trade
Trader
Party
Version 4
Trade

Valuation Legs

Valuations

rt Transaction Mapping

Cashflow Mapping

Party Alias
Facts
Transaction

Cashflows

Legs

Parties This is a dimension
Ledger Book
•  It has a different
Source Book

Cost Centre key to the Facts.
Dimensions
Product
•  And it’s BIG

Business Unit

HCS Entity

Set of Books

0 125,000,000

Party Alias

Parties

Ledger Book

Source Book

Cost Centre

Product


Business Unit

HCS Entity

Set of Books

0 1,250,000 2,500,000 3,750,000 5,000,000

Party Alias

Parties

Ledger Book

Source Book

Cost Centre

Product


Business Unit

HCS Entity

Set of Books

20 1,250,015 2,500,010 3,750,005 5,000,000

So we only replicate
‘Connected’ or ‘Used’
dimensions

Processing Layer
Dimension Caches
(Replicated)

Transactions

Data Layer
As new Facts are added Mtms
relevant Dimensions that
they reference are moved
Cashflows
to processing layer caches

Fact Storage
(Partitioned)

Query Layer
Save Trade
(With connected
dimension Caches)

Data Layer
Cache
Trade
(All Normalised)
Store

Partitioned
Trigger
Source Cache
Party Ccy
Alias
Book

Query Layer
(With connected
dimension Caches)

Data Layer
Trade
(All Normalised)

Party Source Ccy
Alias
Book

Query Layer
(With connected
dimension Caches)

Data Layer
Trade
(All Normalised)

Party Source Ccy
Alias
Book

Party
Ledger

Book

‘Connected Replication’
A simple pattern which
recurses through the foreign
keys in the domain model,
ensuring only ‘Connected’
dimensions are replicated

Java
client

Java schema
API
Java ‘Stored
Procedures’
and ‘Triggers’

Balancing Replication and Partitioning in a Distributed Java Database

Balancing Replication and Partitioning in a Distributed Java Database

Recommandé

Recommandé

Contenu connexe

Tendances

Tendances (20)

Similaire à Balancing Replication and Partitioning in a Distributed Java Database

Similaire à Balancing Replication and Partitioning in a Distributed Java Database (20)

Plus de Ben Stopford

Plus de Ben Stopford (20)

Dernier

Dernier (20)

Balancing Replication and Partitioning in a Distributed Java Database

Notes de l'éditeur