The speakers will describe the flexible configuration possibilities that Objectivity/DB provides, with an emphasis on how best to distribute data across multiple storage nodes. The session will start by describing the distributed processing architecture of Objectivity/DB before covering the new Placement Manager features. The speakers will also describe how Objectivity/DB compares and contrasts with other NoSQL solutions.
44. Placement
• The vast majority of DBMSs place data
– On disk using some type of organization.
– On different disks.
– On different nodes.
• A lot of what differentiates DBMSs is how they place
data.
• The way data is placed greatly affects how well a
DBMS suits particular models (e.g. relational,
document, graph) and particular use cases.
– Grouping by sets is good for relational, but not so good for
document or graph.
45. Model Influences Placement
• Most DBMSs are model specific:
– Relational – Key-Value – Document – Graph
• They naturally lay out (place) data in ways that best support
those models and the use cases they tend to attract.
– Relational DBMSs tend to place like-typed data together, as that
performs best given their set-based access.
– Key-Value DBMSs tend to place data according to key values
(hashed), as that makes looking up by key fast.
– Document DBMSs tend to place data in hierarchies that represent
documents to make working with a document fast.
– Graph DBMSs tend to place data according to its connectivity with
other data as that makes navigation fast.
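Two of the tendencies above, grouping like-typed data and placing by hashed key, can be contrasted in a toy sketch. The function names, the record layout, and the use of CRC32 as the hash are illustrative assumptions and do not come from any particular DBMS.

```python
import zlib
from collections import defaultdict


def place_by_type(records):
    """Relational-style placement: records of the same type share a group."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["type"]].append(rec)
    return dict(groups)


def place_by_key_hash(records, buckets):
    """Key-value-style placement: a stable hash of the key picks the bucket."""
    groups = defaultdict(list)
    for rec in records:
        bucket = zlib.crc32(rec["key"].encode("utf-8")) % buckets
        groups[bucket].append(rec)
    return dict(groups)


# Hypothetical sample data.
records = [
    {"key": "p1", "type": "Person"},
    {"key": "o1", "type": "Order"},
    {"key": "p2", "type": "Person"},
]
by_type = place_by_type(records)
by_hash = place_by_key_hash(records, buckets=4)
```

Grouping by type keeps both Person records adjacent, which favors set-based scans; hashing spreads records by key, which favors single-key lookups but scatters records of the same type.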
46. Placement Configurability
• Performance is the main reason behind placement
choices. Orders of magnitude differences can occur
between efficient and inefficient data placements.
• DBMSs place data based on their understanding and
experience with the use cases that fit with the model
and their target markets.
• Knowing that no single general placement is best
for every application, they usually make
placement configurable to some degree.
47. Typical Placement Configurability
• DBMSs usually provide configurability within their
general placement framework:
– Table spaces, fragmenting, partitioning, sharding.
– Node assignments.
– Limited clustering (e.g. across joins).
• Most do not provide the ability to change the
framework, in whole or in part, which matters:
– When you model or use data differently.
– When some of your data does not fit well with general
assumptions made by the framework.
48. Fully Configurable Placement
• Model independent.
• Choosing which objects (e.g. records/rows) are placed together
(clustering).
– Because they are going to be used together.
• Which ones are placed separately.
– Because they are going to be used separately and may be distributed
separately.
• Which ones you base a primary, secondary, tertiary, etc. organization on.
• Having different organizations for different data.
– Different types of data.
– The same types but used differently.
– Data used in different modes, e.g. demo vs. production.
• Distributing data based on what is used together and locality of use.
50. Placement Model
• The Placement Model is how a database designer
expresses a placement design. The Placement Manager
uses it when placing objects and when helping the query
system look them up.
• A Placement Model primarily consists of rules and placers:
– Rule -- expresses the conditions under which a particular object
placer is used.
– Object Placer -- responsible for placing objects into containers and
keeping track of where they were placed.
– Container Placer -- responsible for placing containers into databases
and keeping track of where they were placed.
– Database Placer -- responsible for placing database files (including
those for its externalized containers) into storage locations (host/path
combinations) and keeping track of where they were placed.
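The relationship between rules and object placers can be sketched as follows. This is a minimal illustration under stated assumptions, not the Objectivity/DB API: the class names `Rule`, `ObjectPlacer`, and `PlacementModel`, the fixed container capacity, and the first-matching-rule policy are all hypothetical. Container and database placers are omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class ObjectPlacer:
    """Hypothetical placer: fills containers in order and records locations."""
    name: str
    container_capacity: int = 2
    containers: List[List[str]] = field(default_factory=list)
    locations: Dict[str, int] = field(default_factory=dict)

    def place(self, object_id: str) -> int:
        # Open a new container when none exists or the last one is full.
        if not self.containers or len(self.containers[-1]) >= self.container_capacity:
            self.containers.append([])
        index = len(self.containers) - 1
        self.containers[index].append(object_id)
        self.locations[object_id] = index  # remembered for later lookups
        return index


@dataclass
class Rule:
    """Hypothetical rule: a condition that selects one object placer."""
    condition: Callable[[dict], bool]
    placer: ObjectPlacer


class PlacementModel:
    """Hypothetical model: the first matching rule decides the placer."""

    def __init__(self, rules: List[Rule], default_placer: ObjectPlacer):
        self.rules = rules
        self.default_placer = default_placer

    def placer_for(self, obj: dict) -> ObjectPlacer:
        for rule in self.rules:
            if rule.condition(obj):
                return rule.placer
        return self.default_placer

    def place(self, obj: dict) -> Tuple[str, int]:
        placer = self.placer_for(obj)
        return placer.name, placer.place(obj["id"])


orders = ObjectPlacer("orders")
default = ObjectPlacer("default")
model = PlacementModel([Rule(lambda o: o["type"] == "Order", orders)], default)

first = model.place({"id": "o1", "type": "Order"})
second = model.place({"id": "o2", "type": "Order"})
third = model.place({"id": "o3", "type": "Order"})   # spills into a new container
other = model.place({"id": "p1", "type": "Person"})  # no rule matches
```

Because each placer records where it put an object, the same structure that decides placement can also answer lookups, which is the point behind "we place it, therefore we find it".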
56. Place According to Value (Partition)
[Figure: value ranges and their associated numbers, as recovered from the slide: A-D: 3, D-E: 8, F-H: 12, J-K: 55, P-Q: 106, S-Z: 197.]
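Placing according to value amounts to mapping a key to a partition by range. The sketch below assumes hypothetical range boundaries and partition names (they are not the values from the slide's figure) and partitions on the first letter of the key.

```python
import bisect

# Hypothetical inclusive upper bounds of each key range, in sorted order.
UPPER_BOUNDS = ["D", "H", "P", "Z"]
# Hypothetical partition names, one per range.
PARTITIONS = ["part-0", "part-1", "part-2", "part-3"]


def partition_for(key: str) -> str:
    """Return the partition whose range covers the key's first letter."""
    if not key:
        raise ValueError("key must be non-empty")
    first = key[0].upper()
    # bisect_left finds the first upper bound that is >= the letter.
    index = bisect.bisect_left(UPPER_BOUNDS, first)
    if index == len(UPPER_BOUNDS):
        raise ValueError(f"no partition covers key {key!r}")
    return PARTITIONS[index]
```

With these bounds, `partition_for("Adams")` and `partition_for("dylan")` both land in `part-0`, while `partition_for("Evans")` lands in `part-1`. The same sorted-bounds table serves placement and lookup, so a range query only needs to visit the partitions whose ranges overlap it.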
57. Placement Summary
• Placement has a huge influence on performance.
• You can mix and match all possible placement
arrangements.
• You can use different placement schemes for different parts
of your data.
• We place it, therefore we find it (efficiently).
• It is easy to try different placement schemes to see what
works best for you.
• You can have different placement schemes for demo vs.
production, etc.
• You can evolve your placement as models are versioned.