This session focuses on delivering operationally robust deployments of MongoDB via specific design capabilities and varying data feeds. Learn how to use services or driver wrappers to unify design patterns for managing data. This talk will address the following questions:
How do you enforce a schema?
How do you redact or remove sensitive data in queries and feeds?
How do you detect and police "out of profile" queries and make sure they do not threaten your system?
2. 2
Part 3 In The Data Management Series
Validating Data
Software Best Practices
Safe Leverage
From Relational
To MongoDB
Conquering
Data Proliferation
Bulletproof
Data Management
Part
1
Part
2
Part
3
3. 3
Congratulations! At this Point You’ve:
• Created a Data Design
• Migrated Data
• Built a PoC or maybe an App
• Explored Operations
4. 4
The Next Stage: Defend & Leverage!
• Document Validation
• Redaction
• Quality Of Service
5. 5
MongoDB Doesn’t Have These Things
• Document Validation
• Redaction
• Quality Of Service
7. Write Some Code!
1. Focus on interfaces
2. Design for change
3. Keep application, data access layer, data management logic, and database I/O well-factored
4. Minimize compile-time binding
8. 8
Starting Point: The Data Access Layer
MongoDB
Java Driver
Data Access
Layer
Application
class DataAccessLayer {
    private String authenticatedID;
    private String effectiveID;
    private Role role;
    private DB db; // held as a field so the read/write methods can reach it

    void init() {
        MongoClient mc = new MongoClient(args);
        db = mc.getDB(args);
    }

    List<Map> getTransactions(Map predicate) {
        Map mql = doWhateverYouNeed(predicate);
        DBCollection coll = db.getCollection("TX");
        DBCursor c = coll.find(new BasicDBObject(mql));
        List<Map> list = new ArrayList<Map>();
        while (c.hasNext()) {
            Map raw = (Map) c.next();
            Map morphed = myMorphingLogic(raw);
            list.add(morphed);
        }
        return list;
    }
}
10. 10
A Query Filters Outbound Data
{$and:[{"name":"buzz"},{"prefs":{$exists:true}}]}
11. 11
How About Using It To Filter Inbounds?
{$and:[{"name":"buzz"},{"prefs":{$exists:true}}]}
12. 12
$exists And $type Already in MQL
Ensure "name" exists (because not null) and is a string:
{"name":{$type:2}}
"age" is optional, but if it exists it must be a 32-bit integer:
{$or:[{"age":{$exists:false}}, {"age":{$type:16}}]}
"name" required as a string, and weight and height either both required as integers or both not present:
{$and: [
{"name": {$type:2}},
{$or:[
{$and:[{"weight":{$type:16}}, {"height":{$type:16}}]}
,{$and:[{"weight":{$exists:0}}, {"height":{$exists:0}}]}
]}
]}
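To make the idea of reusing MQL as an inbound filter concrete, here is a minimal sketch of an evaluator that checks a document (a plain Map) against the operators used on this slide. It is an illustration under stated assumptions, not the MQLValidator from the talk: the class name MiniMqlValidator and the helper doc() are hypothetical, only $and, $or, $exists, $type (codes 2 and 16) and plain equality are handled, and a boolean is returned where the talk uses a ValidationResult.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for the validator idea; not a MongoDB API.
class MiniMqlValidator {

    // Convenience builder: doc("name", "buzz", "age", 30)
    static Map<String, Object> doc(Object... kv) {
        Map<String, Object> m = new LinkedHashMap<>();
        for (int i = 0; i + 1 < kv.length; i += 2) {
            m.put((String) kv[i], kv[i + 1]);
        }
        return m;
    }

    // Returns true when every clause of the MQL expression holds for data.
    @SuppressWarnings("unchecked")
    static boolean validate(Map<String, Object> mql, Map<String, Object> data) {
        for (Map.Entry<String, Object> e : mql.entrySet()) {
            String key = e.getKey();
            Object spec = e.getValue();
            boolean ok;
            if (key.equals("$and")) {
                ok = true;
                for (Object sub : (List<Object>) spec) {
                    ok = ok && validate((Map<String, Object>) sub, data);
                }
            } else if (key.equals("$or")) {
                ok = false;
                for (Object sub : (List<Object>) spec) {
                    ok = ok || validate((Map<String, Object>) sub, data);
                }
            } else if (spec instanceof Map) {
                ok = checkField((Map<String, Object>) spec, data, key);
            } else {
                // Plain equality, e.g. {"name": "buzz"}
                ok = data.containsKey(key) && Objects.equals(spec, data.get(key));
            }
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    // Applies field-level operators such as {$exists: false} or {$type: 16}.
    private static boolean checkField(Map<String, Object> ops,
                                      Map<String, Object> data, String field) {
        boolean present = data.containsKey(field);
        Object value = data.get(field);
        for (Map.Entry<String, Object> op : ops.entrySet()) {
            Object arg = op.getValue();
            switch (op.getKey()) {
                case "$exists":
                    if (present != truthy(arg)) return false;
                    break;
                case "$type":
                    if (!present || !hasType(value, ((Number) arg).intValue())) return false;
                    break;
                default:
                    throw new IllegalArgumentException("unsupported operator: " + op.getKey());
            }
        }
        return true;
    }

    // The slides use both true/false and 1/0 as $exists arguments.
    private static boolean truthy(Object arg) {
        if (arg instanceof Boolean) return (Boolean) arg;
        return ((Number) arg).intValue() != 0;
    }

    // Only the two BSON type codes that appear on the slide are mapped.
    private static boolean hasType(Object v, int bsonType) {
        switch (bsonType) {
            case 2:  return v instanceof String;
            case 16: return v instanceof Integer;
            default: throw new IllegalArgumentException("unsupported type code: " + bsonType);
        }
    }
}
```

With the "age" rule from this slide, a document without "age" or with an Integer "age" passes, and one with a String "age" fails. Because the evaluator depends only on Map and List, it stays independent of the driver, which matches the "Validator NOT inline to MongoDB driver" point that follows.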
14. 14
A New MQL Validator Module Emerges
class MQLValidator {
ValidationResult validate(Map MQL, Map data)
}
MongoDB
Java Driver
Data Access
Layer
Application
Validator NOT inline to MongoDB driver
• Interface too big to create a façade
• Beware of “tall stacks”
MQLValidator
15. 15
MongoDB
DB Engine
Migrating Capability into MongoDB
MongoDB
Java Driver
MQLValidator
Java
Data Access
Layer
MongoDB
DB Engine
MongoDB
Java Driver
MQLValidator
Java
Data Access
Layer
• Coming in v3.2!
• Investment in validation design preserved
• Validation enforceable through ALL drivers
and languages
MongoDB
Python Driver
Application Application
16. 16
Code For The Future…Today
class DataAccessLayer {
    void someWriteOperation(Map data) {
        if (ValidationEnabledInMongoDBengine) {
            collection.insert(data); // Not yet
        } else {
            Map mql = getMQL(); // we'll see this shortly!
            // {$or:[{"age":{$exists:false}},
            //       {"age":{$type:16}}]}
            ValidationResult vr = MQLValidator.validate(mql, data);
            if (vr.ok()) {
                collection.insert(data);
            }
        }
    }
}
26. 26
The Stack So Far
MongoDB
Java Driver
MQLValidator
Data Access
Layer
Application
ValidatorDBUtils
ValidatorDBUtils populates an MQLValidator object from MongoDB
PQLFilter
27. 27
Representative Example
class DataAccessLayer {
    MQLValidator vv = new MQLValidator(); // NOT DB dependent!
    void init() {
        DB db = mongoClient.getDB("mydb");
        ValidatorDBUtils.populate(vv, db); // db.validations
    }
    void someWriteOperation(Map data) {
        if (ValidationEnabledInMongoDBengine) {
            collection.insert(data); // Not yet
        } else {
            String vn = "appropriateValidationRulesName";
            ValidationResult vr = vv.validate(collname, vn, data);
            if (vr.ok()) {
                collection.insert(data);
            }
        }
    }
}
29. 29
Concept: Post Query Operations (PQO)
{ ssn: { $hash: model }, birthdate: null }
{$and:[{"name":"buzz"},{"prefs":{$exists:true}}]}
30. 30
Adopt MQL-like behavior
Remove field by setting to null:
{"ssn":null}
Redact address with fixed value:
{"address": "XXXX"}
Substitute SSN with a different, correct, consistent value:
{"ssn": { $substitute: "ssnmodel" }}
Hash counterparty name to consistent value:
{"counterparty": { $hash: "MD5" }}
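The post-query operations above can be sketched as a small in-place transformer over a result document. This is only an illustration under assumptions: the class name MiniPostQuery is hypothetical, documents are plain Maps, and $substitute is deliberately left out because the talk does not show what an "ssnmodel" contains. Hashing uses the JDK's MessageDigest, so the same input always yields the same output.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;

// Hypothetical sketch of the PostQuery idea; not a MongoDB API.
class MiniPostQuery {

    // Applies each operation to the matching field of data, in place.
    static void process(Map<String, Object> data, Map<String, Object> operations) {
        for (Map.Entry<String, Object> op : operations.entrySet()) {
            String field = op.getKey();
            Object action = op.getValue();
            if (!data.containsKey(field)) {
                continue; // nothing to redact
            }
            if (action == null) {
                // {"ssn": null} removes the field entirely
                data.remove(field);
            } else if (action instanceof Map) {
                // {"counterparty": {$hash: "MD5"}} replaces the value with its digest
                Object algorithm = ((Map<?, ?>) action).get("$hash");
                if (algorithm == null) {
                    throw new IllegalArgumentException("unsupported operation on " + field);
                }
                data.put(field, hash(String.valueOf(data.get(field)), (String) algorithm));
            } else {
                // {"address": "XXXX"} overwrites with a fixed value
                data.put(field, action);
            }
        }
    }

    // Hex-encoded digest of value using a JDK-supported algorithm name.
    static String hash(String value, String algorithm) {
        try {
            MessageDigest md = MessageDigest.getInstance(algorithm);
            byte[] digest = md.digest(value.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            for (byte b : digest) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalArgumentException("unknown hash algorithm: " + algorithm, e);
        }
    }
}
```

The operations map needs to allow null values (a HashMap does), since null is the "remove" marker. MD5 appears here only because the slide names it; a consistent hash is not the same as strong anonymization.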
31. 31
A New PostQuery Module Emerges
class PostQuery {
process(Map data, Map operations)
}
PostQuery
MongoDB
Java Driver
MQLValidator
Data Access
Layer
Application
ValidatorDBUtils
PQLFilter
41. 41
Representative Example
class DataAccessLayer {
    MQLValidator vv = new MQLValidator(); // NOT DB dependent!
    PostQuery pp = new PostQuery();
    QOS qs = new QOS();
    void init() {
        DB db = mongoClient.getDB("mydb");
        ValidatorDBUtils.populate(vv, db);
        PQODBUtils.populate(pp, db);
        QOSDBUtils.populate(qs, db);
    }
    Map someReadOperation(Map pred) {
        Map mql = convertToMQL(pred);
        String role = getRole(); // somehow
        int maxms = qs.getMaxTime("someReadOperation", role);
        // Simplified: find() really returns a cursor; iteration is omitted here
        Map data = collection.find(mql).maxTime(maxms, TimeUnit.MILLISECONDS);
        String pqon = "appropriatePQORulesName";
        pp.process(collname, pqon, data); // in place update
        return data;
    }
}
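The slide shows only the call qs.getMaxTime(operation, role); how the QOS object stores its limits is not shown. One way it might work, as a minimal sketch: an in-memory table keyed by operation and role, with a wildcard role and a global default as fallbacks. The class name MiniQos, the "*" wildcard, and the setMaxTime method are all assumptions made for illustration.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a QOS lookup; not part of any MongoDB driver.
class MiniQos {
    private final Map<String, Integer> limits = new HashMap<>();
    private final int defaultMaxMs;

    MiniQos(int defaultMaxMs) {
        this.defaultMaxMs = defaultMaxMs;
    }

    // Registers a time budget in milliseconds; role "*" applies to any role.
    void setMaxTime(String operation, String role, int maxMs) {
        limits.put(key(operation, role), maxMs);
    }

    // Most specific match wins: exact role, then wildcard, then the default.
    int getMaxTime(String operation, String role) {
        Integer exact = limits.get(key(operation, role));
        if (exact != null) {
            return exact;
        }
        Integer anyRole = limits.get(key(operation, "*"));
        return anyRole != null ? anyRole : defaultMaxMs;
    }

    private static String key(String operation, String role) {
        return operation + "|" + role;
    }
}
```

The returned value would then be handed to the cursor's maxTime setting, so the budget is enforced by the engine rather than by the data access layer.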
42. 42
QOSDBUtils
A Highly Leveragable Investment
PostQuery
MQLValidator
Data Access
Layer 1
Application1
ValidatorDBUtils
PQLFilter
PQODBUtils
QOS
Application2
Data Access
Layer 2
Application3
Application4
Data Access
Layer 3
Application5
Application6
Reusable For ALL Data Access Layer Logic
43. 43
Not Just Java? Not A Problem
DAL operations have little or no state…
Data and MQL and diagnostics easily
and losslessly converted to and from
JSON…
Can you say … Web Service!
55. 55
The RESTful Provider
class RESTfulProvider implements DataProvider {
    void init() { } // set up HTTP machine:port endpoint
    Map fetch(String collection, Map mql) {
        String jsonstr = JSONUtils.toJSON(mql);
        String url = construct(collection, jsonstr);
        // url is:
        // http://machine:port/collectionName?op=find&mql='{"product":"cleanser","expires": {$gt: {$date: "20200101"}}}'
        HTTPResponse res = call(url);
        Map data = JSONUtils.fromJSON(res.getContent());
        return data;
    }
}
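The construct() helper is named on the slide but its body is not shown. Because the MQL travels as JSON inside a query string, the braces, quotes, and colons need percent-encoding before the URL is sent. A minimal sketch, assuming the URL layout shown on the slide; the class name MiniRestUrl and the extra base parameter are hypothetical:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

// Hypothetical helper illustrating URL construction for the RESTful provider.
class MiniRestUrl {

    // Builds base/collection?op=find&mql=<percent-encoded JSON>.
    static String construct(String base, String collection, String mqlJson) {
        try {
            return base + "/" + collection + "?op=find&mql="
                    + URLEncoder.encode(mqlJson, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is required on every Java platform, so this should not occur.
            throw new IllegalStateException(e);
        }
    }

    // What the receiving service would do to recover the original JSON.
    static String decodeMql(String encoded) {
        try {
            return URLDecoder.decode(encoded, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

For long or sensitive predicates, sending the MQL in a POST body instead of the query string is a common alternative; the slide's GET form is kept here for fidelity.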
Editor's Notes
HELLO!
This is Buzz Moschetti at MongoDB
Buy Subs, goddamit…! :-D
Some quick logistics.
In the last 5 to 10 mins today, we will answer the most common questions that have been submitted.
Important things to consider: a, b, c
Not appearing in this film today:
Exception/errors and edge condition handling
Options in design WRT class inheritance, per-thread (or more) DAL models vs. static methods, cartridge models, etc.
In particular, we will see
In addition to being well factored, this permits the DAL to contain both DB-persisted validation and dynamic, business data driven validation managed by the SAME code set with the SAME expression language.
NOTE: NO mention of user and role here!
We only define a set of ops by rule name.
Something else has to associate these with users and roles.
ALSO: There is a nuance between entitlements set up on the DB and entitlements at the "user level." Consider Heathrow Airport: you are entitled to see things, but if you are not on your home network, you cannot see the SSN.
As mentioned before, something else has to associate these rules with users and roles.
Blackout is simple: permit or deny based on any number of factors.
maxTime: Engine time, not wall clock time; a good proxy for actual load on the engine.
On behalf of all of us at MongoDB, thank you for attending this webinar!
I hope what you saw and heard today gave you some insight and clues into what you might face in your own schema design efforts.
Remember you can always reach out to us at MongoDB for guidance.
With that, code well and be well.