2011 july-gtug-high-replication-datastore
1. High Replication
Datastore
Ikai Lan
plus.ikailan.com
NYC GTUG
July 27, 2011
Wednesday, July 27, 2011
2. About the speaker
• Ikai Lan
• Developer Relations at Google based out
of San Francisco, CA
• Twitter: @ikai
• Google+: plus.ikailan.com
3. Agenda
• What is App Engine?
• What is High Replication datastore?
• Underneath the hood
15. App Engine
Datastore
Schemaless, non-relational
datastore built on top of
Google’s Bigtable technology
Enables rapid development
and scalability
16. High Replication
• Strongly consistent
• Multi-datacenter
• High reliability
• Consistent performance
• No data loss
17. How do I use HR?
• Create a new application! Just remember
the rules
• Fetch by key and ancestor queries exhibit
strongly consistent behavior
• Queries without an ancestor exhibit
eventually consistent behavior
18. Strong vs. Eventual
• Strong consistency means immediately after
the datastore tells us the data has been
committed, a subsequent read will return
the data written
• Eventual consistency means that some time
after the datastore tells us data has been
committed, a read will return the written data;
an immediate read may or may not return it
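A toy model can make the two definitions concrete. The sketch below is an illustration only and does not use the App Engine API: `ToyStore`, `strongRead`, `eventualRead`, and `replicate` are hypothetical names, and it assumes a write lands on one copy immediately and reaches a second copy only later. The real datastore's replication is far more involved.

```java
import java.util.ArrayDeque;
import java.util.HashMap;
import java.util.Map;
import java.util.Queue;

class ConsistencyDemo {
    /** Hypothetical toy store: writes hit the primary copy at once and the replica only on replicate(). */
    static class ToyStore {
        private final Map<String, Integer> primary = new HashMap<>();
        private final Map<String, Integer> replica = new HashMap<>();
        private final Queue<String> pending = new ArrayDeque<>();

        void put(String key, int value) {
            primary.put(key, value);
            pending.add(key); // remember that the replica still needs this key
        }

        /** Strong read: always sees the last committed write. */
        Integer strongRead(String key) {
            return primary.get(key);
        }

        /** Eventual read: sees the write only after replication has run. */
        Integer eventualRead(String key) {
            return replica.get(key);
        }

        /** Copies every pending write to the replica. */
        void replicate() {
            while (!pending.isEmpty()) {
                String key = pending.remove();
                replica.put(key, primary.get(key));
            }
        }
    }

    public static void main(String[] args) {
        ToyStore store = new ToyStore();
        store.put("item", 123);
        System.out.println("strong read: " + store.strongRead("item"));
        System.out.println("eventual read right away: " + store.eventualRead("item"));
        store.replicate();
        System.out.println("eventual read later: " + store.eventualRead("item"));
    }
}
```

In this model the immediate eventual read comes back empty and the later one returns the value, which is the "may or may not" case from the second bullet.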
19. This is strongly
consistent
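import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.EntityNotFoundException;
import com.google.appengine.api.datastore.Key;

DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
Entity item = new Entity("Item");
item.setProperty("data", 123);
Key key = datastore.put(item);
// This exhibits strong consistency.
// It should return the item we just saved.
// get() declares the checked EntityNotFoundException, so call it inside try/catch.
Entity result = datastore.get(key);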
20. This is strongly
consistent
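import java.util.List;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.Query;
import com.google.appengine.api.datastore.Query.FilterOperator;

// Save the entity root (datastore is the DatastoreService from the previous slide)
Entity root = new Entity("Root");
Key rootKey = datastore.put(root);
// Save the child, passing the root key as its parent
Entity childItem = new Entity("Item", rootKey);
childItem.setProperty("data", 123);
datastore.put(childItem);
// Restrict the query to the entity group rooted at rootKey
Query strongConsistencyQuery = new Query("Item");
strongConsistencyQuery.setAncestor(rootKey);
strongConsistencyQuery.addFilter("data", FilterOperator.EQUAL, 123);
FetchOptions opts = FetchOptions.Builder.withDefaults();
// This query exhibits strong consistency.
// It will return the item we just saved.
List<Entity> results = datastore.prepare(strongConsistencyQuery).asList(opts);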
21. This is eventually
consistent
Entity item = new Entity("Item");
item.setProperty("data", 123);
datastore.put(item);
// Not an ancestor query
Query eventuallyConsistentQuery = new Query("Item");
eventuallyConsistentQuery.addFilter("data", FilterOperator.EQUAL, 123);
FetchOptions opts = FetchOptions.Builder.withDefaults();
// This query exhibits eventual consistency.
// It will likely return an empty list.
List<Entity> results = datastore.prepare(eventuallyConsistentQuery)
.asList(opts);
22. Why?
• Reads are transactional
• On a read, we try to determine if we have
the latest version of some data
• If not, we catch up the data on the node to
the latest version
23. To understand this ...
• We need some understanding of Paxos ...
• ... which necessitates some understanding
of transactions
• ... which necessitates some understanding
of entity groups
24. Entity Groups
[Diagram: an entity group drawn as a tree. A User entity is the entity group root; below it are two Blog entities, three Entry entities, and three Comment entities.]
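The tree on this slide can be modeled as keys that point at their parents. The sketch below is a stand-in written in plain Java, not the App Engine `Key` class: `ToyKey`, `root`, `sameEntityGroup`, and `path` are hypothetical names, and the User, Blog, Entry, Comment nesting is assumed from the diagram.

```java
class EntityGroupDemo {
    /** Hypothetical stand-in for a datastore key: a kind, a name, and an optional parent. */
    static final class ToyKey {
        final String kind;
        final String name;
        final ToyKey parent; // null for an entity group root

        ToyKey(String kind, String name, ToyKey parent) {
            this.kind = kind;
            this.name = name;
            this.parent = parent;
        }

        /** Walks up the parent chain to the entity group root. */
        ToyKey root() {
            ToyKey k = this;
            while (k.parent != null) {
                k = k.parent;
            }
            return k;
        }

        /** Two keys share an entity group when they share a root (identity check in this toy). */
        boolean sameEntityGroup(ToyKey other) {
            return root() == other.root();
        }

        /** Full ancestor path, root first. */
        String path() {
            String self = kind + ":" + name;
            return parent == null ? self : parent.path() + "/" + self;
        }
    }

    public static void main(String[] args) {
        ToyKey user = new ToyKey("User", "alice", null);
        ToyKey blog = new ToyKey("Blog", "b1", user);
        ToyKey entry = new ToyKey("Entry", "e1", blog);
        ToyKey comment = new ToyKey("Comment", "c1", entry);
        ToyKey otherUser = new ToyKey("User", "bob", null);

        System.out.println(comment.path());
        System.out.println("same group as blog: " + comment.sameEntityGroup(blog));
        System.out.println("same group as other user: " + comment.sameEntityGroup(otherUser));
    }
}
```

Everything that hangs off the same root belongs to one entity group; a second User starts a separate group.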
25. Entity groups
// Save the entity root
Entity root = new Entity("Root");
Key rootKey = datastore.put(root);
// Save the child
Entity childItem = new Entity("Item", rootKey);
childItem.setProperty("data", 123);
datastore.put(childItem);
Query strongConsistencyQuery = new Query("Item");
strongConsistencyQuery.setAncestor(rootKey);
strongConsistencyQuery.addFilter("data", FilterOperator.EQUAL, 123);
FetchOptions opts = FetchOptions.Builder.withDefaults();
// This query exhibits strong consistency.
// It will return the item we just saved.
List<Entity> results = datastore.prepare(strongConsistencyQuery)
.asList(opts);
27. Optimistic locking
• Client A reads data. Its current version is 11
• Client B reads data. Its current version is 11
• Both clients modify the data and increment the version to 12
• Client B tries to save data. Success!
• Client A tries to save data. The datastore version is greater than or equal to Client A's version - FAIL
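The version check on this slide can be sketched in a few lines of plain Java. This is a minimal illustration under the slide's assumptions, not datastore code: `VersionedStore`, `readVersion`, and `tryCommit` are hypothetical names.

```java
class OptimisticLockDemo {
    /** Hypothetical versioned record; not an App Engine class. */
    static class VersionedStore {
        private long version = 11;
        private String data = "original";

        synchronized long readVersion() {
            return version;
        }

        synchronized String data() {
            return data;
        }

        /** Commits only if the writer's new version is strictly newer than the stored one. */
        synchronized boolean tryCommit(long newVersion, String newData) {
            if (version >= newVersion) {
                return false; // another writer already committed this version
            }
            version = newVersion;
            data = newData;
            return true;
        }
    }

    public static void main(String[] args) {
        VersionedStore store = new VersionedStore();
        long seenByA = store.readVersion(); // both clients read version 11
        long seenByB = store.readVersion();

        boolean bCommitted = store.tryCommit(seenByB + 1, "from B");
        boolean aCommitted = store.tryCommit(seenByA + 1, "from A");

        System.out.println("B committed: " + bCommitted);
        System.out.println("A committed: " + aCommitted);
        System.out.println("stored data: " + store.data());
    }
}
```

Client A's write is rejected without ever taking a lock up front; A would have to re-read the data and retry.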
28. Transactional reads
// Save the entity root
Entity root = new Entity("Root");
Key rootKey = datastore.put(root);
// Save the child
Entity childItem = new Entity("Item", rootKey);
childItem.setProperty("data", 123);
datastore.put(childItem);
Query strongConsistencyQuery = new Query("Item");
strongConsistencyQuery.setAncestor(rootKey);
strongConsistencyQuery.addFilter("data", FilterOperator.EQUAL, 123);
FetchOptions opts = FetchOptions.Builder.withDefaults();
// This query exhibits strong consistency.
// It will return the item we just saved.
List<Entity> results = datastore.prepare(strongConsistencyQuery)
.asList(opts);
29. Transactional reads
[Diagram: a Blog Entry at version 11 with two child Comments (Parent: Entry). One Comment is at version 11; the other, at version 12, is still being committed. Client A is writing data to the datastore while Client B reads data transactionally.]
Version 12 has not finished committing -
Datastore returns version 11
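The behavior on this slide, where a reader gets version 11 while version 12 is still committing, can be modeled as follows. This is a sketch under simplifying assumptions, not the datastore's implementation: `VersionedValue`, `beginWrite`, `finishCommit`, and `transactionalRead` are hypothetical names.

```java
import java.util.Map;
import java.util.TreeMap;

class SnapshotReadDemo {
    /** Hypothetical versioned value for one entity group; not a datastore class. */
    static class VersionedValue {
        private final TreeMap<Long, String> versions = new TreeMap<>();
        private long lastCommitted = 0;

        /** A writer stages a new version; readers cannot see it yet. */
        synchronized void beginWrite(long version, String value) {
            versions.put(version, value);
        }

        /** The write finishes committing and becomes visible. */
        synchronized void finishCommit(long version) {
            lastCommitted = Math.max(lastCommitted, version);
        }

        /** Returns the newest value whose version is fully committed, or null if none. */
        synchronized String transactionalRead() {
            Map.Entry<Long, String> entry = versions.floorEntry(lastCommitted);
            return entry == null ? null : entry.getValue();
        }
    }

    public static void main(String[] args) {
        VersionedValue comment = new VersionedValue();
        comment.beginWrite(11, "comment v11");
        comment.finishCommit(11);

        comment.beginWrite(12, "comment v12"); // Client A is still committing
        System.out.println("during commit: " + comment.transactionalRead());

        comment.finishCommit(12);
        System.out.println("after commit: " + comment.transactionalRead());
    }
}
```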
30. Paxos simplified
[Diagram: the client asks the datastore "Give me the newest data". The datastore has four nodes (Node A, Node B, Node C, Node D), and a node asks "Is my data up to date?"]
1. If the data is up to date, return it
2. If the data is NOT up to date, "catch up" the data by applying the jobs in the journal and return the latest data
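The two steps above can be sketched as a replica that applies journal entries lazily, at read time. This models only the catch-up step, not Paxos itself (there are no proposals or majority votes here), and `Job`, `Node`, `isUpToDate`, and `read` are hypothetical names chosen for the illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CatchUpReadDemo {
    /** One journal entry: set key to value. */
    static class Job {
        final String key;
        final int value;

        Job(String key, int value) {
            this.key = key;
            this.value = value;
        }
    }

    /** Hypothetical replica that applies journal jobs lazily, on read. */
    static class Node {
        private final Map<String, Integer> data = new HashMap<>();
        private int applied = 0; // number of journal jobs already applied

        boolean isUpToDate(List<Job> journal) {
            return applied == journal.size();
        }

        Integer read(String key, List<Job> journal) {
            // Step 2: if the node is behind, catch up by applying the missing jobs in order.
            while (applied < journal.size()) {
                Job job = journal.get(applied);
                data.put(job.key, job.value);
                applied++;
            }
            // Step 1: the data is now up to date, so return it.
            return data.get(key);
        }
    }

    public static void main(String[] args) {
        List<Job> journal = new ArrayList<>();
        Node node = new Node();

        journal.add(new Job("item", 123));
        journal.add(new Job("item", 456));

        System.out.println("up to date before read: " + node.isUpToDate(journal));
        System.out.println("read: " + node.read("item", journal));
        System.out.println("up to date after read: " + node.isUpToDate(journal));
    }
}
```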
31. More reading
• My example was grossly oversimplified
• More details can be found here:
http://www.cidrdb.org/cidr2011/Papers/CIDR11_Paper32.pdf
32. Contradictory advice
• Entity groups must be as big as possible to
cover as much related data as you can
• Entity groups must be small enough such
that your write rate per entity group never
goes above one write/second
33. Summary
• Remember the rules of strong consistency
and eventual consistency
• Group your data into entity groups when
possible and use ancestor queries