Entity Relationships in a Document Database at CouchConf Boston

Unlike relational databases, document databases like CouchDB and Couchbase do not directly support entity relationships. This talk will explore patterns of modeling one-to-many and many-to-many entity relationships in a document database. These patterns include using an embedded JSON array, relating documents using identifiers, using a list of keys, and using relationship documents.

  1. Entity Relationships in a Document Database: MapReduce Views for SQL Users
  2. Entity: An object defined by its identity and a thread of continuity [1]
     [1] "Entity", Domain-driven Design Community, <http://domaindrivendesign.org/node/109>.
  3. Entity Relationship Model
  4. Join vs. Collation
  5. SQL Query Joining Publishers and Books

         SELECT `publisher`.`id`, `publisher`.`name`, `book`.`title`
         FROM `publisher`
         FULL OUTER JOIN `book` ON `publisher`.`id` = `book`.`publisher_id`
         ORDER BY `publisher`.`id`, `book`.`title`;
  6. Joined Result Set

     | publisher.id | publisher.name | book.title |
     |---|---|---|
     | oreilly | O'Reilly Media | Building iPhone Apps with HTML, CSS, and JavaScript |
     | oreilly | O'Reilly Media | CouchDB: The Definitive Guide |
     | oreilly | O'Reilly Media | DocBook: The Definitive Guide |
     | oreilly | O'Reilly Media | RESTful Web Services |

  7. Joined Result Set (same rows as slide 6): the publisher.id and publisher.name columns are labeled Publisher (“left”).
  8. Joined Result Set (same rows as slide 6): the book.title column is additionally labeled Book (“right”).
  9. Collated Result Set

     | key | id | value |
     |---|---|---|
     | ["oreilly",0] | "oreilly" | "O'Reilly Media" |
     | ["oreilly",1] | "oreilly" | "Building iPhone Apps with HTML, CSS, and JavaScript" |
     | ["oreilly",1] | "oreilly" | "CouchDB: The Definitive Guide" |
     | ["oreilly",1] | "oreilly" | "DocBook: The Definitive Guide" |
     | ["oreilly",1] | "oreilly" | "RESTful Web Services" |

  10. Collated Result Set (same rows as slide 9): the ["oreilly",0] row is labeled Publisher.
  11. Collated Result Set (same rows as slide 9): the ["oreilly",1] rows are additionally labeled Books.
  12. View Result Sets
      Made up of columns and rows
      Every row has the same three columns:
      • key
      • id
      • value
      Columns can contain a mixture of logical data types
  13. One to Many Relationships
  14. Embedded Entities: Nest related entities within a document
  15. Embedded Entities
      A single document represents the “one” entity
      Nested entities (a JSON array) represent the “many” entities
      Simplest way to create a one to many relationship
  16. Example: Publisher with Nested Books

          {
            "_id": "oreilly",
            "collection": "publisher",
            "name": "O'Reilly Media",
            "books": [
              { "title": "CouchDB: The Definitive Guide" },
              { "title": "RESTful Web Services" },
              { "title": "DocBook: The Definitive Guide" },
              { "title": "Building iPhone Apps with HTML, CSS, and JavaScript" }
            ]
          }

  17. Map Function

          function(doc) {
            if ("publisher" == doc.collection) {
              // Publisher row: the 0 sorts it before its books
              emit([doc._id, 0], doc.name);
              // One row per embedded book, sharing the publisher's key prefix
              for (var i in doc.books) {
                emit([doc._id, 1], doc.books[i].title);
              }
            }
          }

  18. Result Set

      | key | id | value |
      |---|---|---|
      | ["oreilly",0] | "oreilly" | "O'Reilly Media" |
      | ["oreilly",1] | "oreilly" | "Building iPhone Apps with HTML, CSS, and JavaScript" |
      | ["oreilly",1] | "oreilly" | "CouchDB: The Definitive Guide" |
      | ["oreilly",1] | "oreilly" | "DocBook: The Definitive Guide" |
      | ["oreilly",1] | "oreilly" | "RESTful Web Services" |
  19. Limitations
      Only works if there aren’t a large number of related entities:
      • Too many nested entities can result in very large documents
      • Slow to transfer between client and server
      • Unwieldy to modify
      • Time-consuming to index
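The Embedded Entities pattern can be tried out without a server. The following is a minimal sketch, assuming a plain Node.js runtime; `runView`, `compareKeys`, and `rank` are hypothetical helpers written for this illustration, not CouchDB APIs. The map function from slide 17 is adapted to receive `emit` as a parameter, and the collation is a deliberate simplification of CouchDB's ICU-based ordering.

```javascript
// In-memory stand-in for a CouchDB view. runView, compareKeys, and rank are
// hypothetical helpers for illustration, not part of any CouchDB API.

// Type order used by this sketch: numbers < strings < arrays < objects.
function rank(v) {
  if (typeof v === "number") return 0;
  if (typeof v === "string") return 1;
  return Array.isArray(v) ? 2 : 3;
}

// Simplified collation: arrays compare element by element, strings by code
// unit. Real CouchDB collation is ICU-based and has more rules.
function compareKeys(a, b) {
  if (rank(a) !== rank(b)) return rank(a) - rank(b);
  if (Array.isArray(a)) {
    for (let i = 0; i < Math.min(a.length, b.length); i++) {
      const c = compareKeys(a[i], b[i]);
      if (c !== 0) return c;
    }
    return a.length - b.length;
  }
  if (rank(a) === 3) return 0; // objects are not compared further here
  return a < b ? -1 : a > b ? 1 : 0;
}

// Runs a map function over every document and returns rows sorted by key,
// then by document ID. emit is passed in rather than provided as a global.
function runView(docs, map) {
  const rows = [];
  for (const doc of docs) {
    map(doc, (key, value) => rows.push({ key, id: doc._id, value }));
  }
  return rows.sort(
    (x, y) => compareKeys(x.key, y.key) || compareKeys(x.id, y.id)
  );
}

// The map function from slide 17, adapted to take emit as a parameter.
function publisherWithBooks(doc, emit) {
  if (doc.collection === "publisher") {
    emit([doc._id, 0], doc.name);
    for (const book of doc.books || []) {
      emit([doc._id, 1], book.title);
    }
  }
}

const oreilly = {
  _id: "oreilly",
  collection: "publisher",
  name: "O'Reilly Media",
  books: [
    { title: "CouchDB: The Definitive Guide" },
    { title: "RESTful Web Services" },
  ],
};

const rows = runView([oreilly], publisherWithBooks);
for (const row of rows) {
  console.log(JSON.stringify(row.key), row.id, row.value);
}
```

Every row carries the publisher's document ID, because all rows are emitted from the same document. How rows with an identical key and ID are ordered relative to each other is left to the sort's stability in this sketch and may differ in a real view.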
  20. Related Documents: Reference an entity by its identifier
  21. Related Documents
      A document representing the “one” entity
      Separate documents for each “many” entity
      Each “many” entity references its related “one” entity by the “one” entity’s document identifier
      Makes for smaller documents
      Reduces the probability of document update conflicts
  22. Example: Publisher

          {
            "_id": "oreilly",
            "collection": "publisher",
            "name": "O'Reilly Media"
          }

  23. Example: Related Book

          {
            "_id": "9780596155896",
            "collection": "book",
            "title": "CouchDB: The Definitive Guide",
            "publisher": "oreilly"
          }

  24. Map Function

          function(doc) {
            if ("publisher" == doc.collection) {
              // Publisher row, keyed by its own ID
              emit([doc._id, 0], doc.name);
            }
            if ("book" == doc.collection) {
              // Book row, keyed by the ID of the publisher it references
              emit([doc.publisher, 1], doc.title);
            }
          }

  25. Result Set

      | key | id | value |
      |---|---|---|
      | ["oreilly",0] | "oreilly" | "O'Reilly Media" |
      | ["oreilly",1] | "9780596155896" | "CouchDB: The Definitive Guide" |
      | ["oreilly",1] | "9780596529260" | "RESTful Web Services" |
      | ["oreilly",1] | "9780596805791" | "Building iPhone Apps with HTML, CSS, and JavaScript" |
      | ["oreilly",1] | "9781565925809" | "DocBook: The Definitive Guide" |
  26. Limitations
      When retrieving the entity on the “right” side of the relationship, one cannot include any data from the entity on the “left” side of the relationship without the use of an additional query
      Only works for one to many relationships
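The "additional query" limitation of Related Documents can be illustrated with two lookups. This is a sketch only: the `Map` named `db` stands in for the database, and `bookWithPublisher` is a hypothetical helper, not a CouchDB call. Against a real server each `db.get` would be a separate request.

```javascript
// Hypothetical in-memory stand-in for the database, keyed by document ID.
const db = new Map([
  ["oreilly", { _id: "oreilly", collection: "publisher", name: "O'Reilly Media" }],
  [
    "9780596155896",
    {
      _id: "9780596155896",
      collection: "book",
      title: "CouchDB: The Definitive Guide",
      publisher: "oreilly",
    },
  ],
]);

// Fetching a book together with its publisher's name takes two lookups:
// one for the book, then a second for the publisher it references.
function bookWithPublisher(db, bookId) {
  const book = db.get(bookId);
  if (!book) return null;
  // The referenced publisher may be absent; nothing enforces the reference.
  const publisher = db.get(book.publisher) || null;
  return {
    title: book.title,
    publisherName: publisher ? publisher.name : null,
  };
}

console.log(bookWithPublisher(db, "9780596155896"));
```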
  27. Many to Many Relationships
  28. List of Keys: Reference entities by their identifiers
  29. List of Keys
      A document representing each “many” entity on the “left” side of the relationship
      Separate documents for each “many” entity on the “right” side of the relationship
      Each “many” entity on the “right” side of the relationship maintains a list of document identifiers for its related “many” entities on the “left” side of the relationship
  30. Books and Related Authors
  31. Example: Book

          {
            "_id": "9780596805029",
            "collection": "book",
            "title": "DocBook 5: The Definitive Guide"
          }

  32. Example: Book

          {
            "_id": "9781565920514",
            "collection": "book",
            "title": "Making TeX Work"
          }

  33. Example: Book

          {
            "_id": "9781565925809",
            "collection": "book",
            "title": "DocBook: The Definitive Guide"
          }

  34. Example: Author

          {
            "_id": "muellner",
            "collection": "author",
            "name": "Leonard Muellner",
            "books": [
              "9781565925809"
            ]
          }

  35. Example: Author

          {
            "_id": "walsh",
            "collection": "author",
            "name": "Norman Walsh",
            "books": [
              "9780596805029",
              "9781565925809",
              "9781565920514"
            ]
          }
  36. Map Function

          function(doc) {
            if ("book" == doc.collection) {
              // Book row, keyed by its own ID
              emit([doc._id, 0], doc.title);
            }
            if ("author" == doc.collection) {
              // One row per listed book, keyed by that book's ID
              for (var i in doc.books) {
                emit([doc.books[i], 1], doc.name);
              }
            }
          }

  37. Result Set

      | key | id | value |
      |---|---|---|
      | ["9780596805029",0] | "9780596805029" | "DocBook 5: The Definitive Guide" |
      | ["9780596805029",1] | "walsh" | "Norman Walsh" |
      | ["9781565920514",0] | "9781565920514" | "Making TeX Work" |
      | ["9781565920514",1] | "walsh" | "Norman Walsh" |
      | ["9781565925809",0] | "9781565925809" | "DocBook: The Definitive Guide" |
      | ["9781565925809",1] | "muellner" | "Leonard Muellner" |
      | ["9781565925809",1] | "walsh" | "Norman Walsh" |
  38. Authors and Related Books
  39. Map Function

          function(doc) {
            if ("author" == doc.collection) {
              emit([doc._id, 0], doc.name);
              // Only the book IDs are available here, so emit them as
              // {"_id": ...} for use with include_docs
              for (var i in doc.books) {
                emit([doc._id, 1], {"_id":doc.books[i]});
              }
            }
          }

  40. Result Set

      | key | id | value |
      |---|---|---|
      | ["muellner",0] | "muellner" | "Leonard Muellner" |
      | ["muellner",1] | "muellner" | {"_id":"9781565925809"} |
      | ["walsh",0] | "walsh" | "Norman Walsh" |
      | ["walsh",1] | "walsh" | {"_id":"9780596805029"} |
      | ["walsh",1] | "walsh" | {"_id":"9781565920514"} |
      | ["walsh",1] | "walsh" | {"_id":"9781565925809"} |

  41. Including Docs
      include_docs=true

      | key | id | value | doc (truncated) |
      |---|---|---|---|
      | ["muellner",0] | "muellner" | … | {"name":"Leonard Muellner"} |
      | ["muellner",1] | "muellner" | … | {"title":"DocBook: The Definitive Guide"} |
      | ["walsh",0] | "walsh" | … | {"name":"Norman Walsh"} |
      | ["walsh",1] | "walsh" | … | {"title":"DocBook 5: The Definitive Guide"} |
      | ["walsh",1] | "walsh" | … | {"title":"Making TeX Work"} |
      | ["walsh",1] | "walsh" | … | {"title":"DocBook: The Definitive Guide"} |
  42. Or, we can reverse the references…
  43. Example: Author

          {
            "_id": "muellner",
            "collection": "author",
            "name": "Leonard Muellner"
          }

  44. Example: Author

          {
            "_id": "walsh",
            "collection": "author",
            "name": "Norman Walsh"
          }

  45. Example: Book

          {
            "_id": "9780596805029",
            "collection": "book",
            "title": "DocBook 5: The Definitive Guide",
            "authors": [
              "walsh"
            ]
          }

  46. Example: Book

          {
            "_id": "9781565920514",
            "collection": "book",
            "title": "Making TeX Work",
            "authors": [
              "walsh"
            ]
          }

  47. Example: Book

          {
            "_id": "9781565925809",
            "collection": "book",
            "title": "DocBook: The Definitive Guide",
            "authors": [
              "muellner",
              "walsh"
            ]
          }
  48. Map Function

          function(doc) {
            if ("author" == doc.collection) {
              // Author row, keyed by its own ID
              emit([doc._id, 0], doc.name);
            }
            if ("book" == doc.collection) {
              // One row per listed author, keyed by that author's ID
              for (var i in doc.authors) {
                emit([doc.authors[i], 1], doc.title);
              }
            }
          }

  49. Result Set

      | key | id | value |
      |---|---|---|
      | ["muellner",0] | "muellner" | "Leonard Muellner" |
      | ["muellner",1] | "9781565925809" | "DocBook: The Definitive Guide" |
      | ["walsh",0] | "walsh" | "Norman Walsh" |
      | ["walsh",1] | "9780596805029" | "DocBook 5: The Definitive Guide" |
      | ["walsh",1] | "9781565920514" | "Making TeX Work" |
      | ["walsh",1] | "9781565925809" | "DocBook: The Definitive Guide" |
  50. Limitations
      Queries from the “right” side of the relationship cannot include any data from entities on the “left” side of the relationship (without the use of include_docs)
      A document representing an entity with lots of relationships could become quite large
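The effect of include_docs on rows whose value is `{"_id": ...}` (slides 40 and 41) can be approximated locally. The sketch below assumes the behaviour described in the deck: when a row's value carries an `_id`, the document with that ID is attached instead of the emitting document. `includeDocs` and `docsById` are hypothetical names for this illustration, not CouchDB APIs.

```javascript
// Documents from the List of Keys example, indexed by ID (in-memory stand-in).
const docsById = new Map(
  [
    { _id: "walsh", collection: "author", name: "Norman Walsh", books: ["9781565920514"] },
    { _id: "9781565920514", collection: "book", title: "Making TeX Work" },
  ].map((doc) => [doc._id, doc])
);

// Rows as the "Authors and Related Books" map function would emit them.
const viewRows = [
  { key: ["walsh", 0], id: "walsh", value: "Norman Walsh" },
  { key: ["walsh", 1], id: "walsh", value: { _id: "9781565920514" } },
];

// Attaches a doc to each row: the document referenced by value._id when
// present, otherwise the document that emitted the row.
function includeDocs(rows, docsById) {
  return rows.map((row) => {
    const isObject = row.value !== null && typeof row.value === "object";
    const targetId = isObject && row.value._id ? row.value._id : row.id;
    // A reference to a document that does not exist yields a null doc.
    return { ...row, doc: docsById.get(targetId) || null };
  });
}

const included = includeDocs(viewRows, docsById);
for (const row of included) {
  console.log(JSON.stringify(row.key), row.doc && (row.doc.name || row.doc.title));
}
```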
  51. Relationship Documents: Create a document to represent each individual relationship
  52. Relationship Documents
      A document representing each “many” entity on the “left” side of the relationship
      Separate documents for each “many” entity on the “right” side of the relationship
      Neither the “left” nor the “right” side of the relationship contains any direct references to the other
      For each distinct relationship, a separate document includes the document identifiers for both the “left” and “right” sides of the relationship
  53. Example: Book

          {
            "_id": "9780596805029",
            "collection": "book",
            "title": "DocBook 5: The Definitive Guide"
          }

  54. Example: Book

          {
            "_id": "9781565920514",
            "collection": "book",
            "title": "Making TeX Work"
          }

  55. Example: Book

          {
            "_id": "9781565925809",
            "collection": "book",
            "title": "DocBook: The Definitive Guide"
          }

  56. Example: Author

          {
            "_id": "muellner",
            "collection": "author",
            "name": "Leonard Muellner"
          }

  57. Example: Author

          {
            "_id": "walsh",
            "collection": "author",
            "name": "Norman Walsh"
          }

  58. Example: Relationship Document

          {
            "_id": "44005f2c",
            "collection": "book-author",
            "book": "9780596805029",
            "author": "walsh"
          }

  59. Example: Relationship Document

          {
            "_id": "44005f72",
            "collection": "book-author",
            "book": "9781565920514",
            "author": "walsh"
          }

  60. Example: Relationship Document

          {
            "_id": "44006720",
            "collection": "book-author",
            "book": "9781565925809",
            "author": "muellner"
          }

  61. Example: Relationship Document

          {
            "_id": "44006b0d",
            "collection": "book-author",
            "book": "9781565925809",
            "author": "walsh"
          }
  62. Books and Related Authors
  63. Map Function

          function(doc) {
            if ("book" == doc.collection) {
              // Book row, keyed by its own ID
              emit([doc._id, 0], doc.title);
            }
            if ("book-author" == doc.collection) {
              // Relationship row, keyed by book ID and pointing at the author
              emit([doc.book, 1], {"_id":doc.author});
            }
          }

  64. Result Set

      | key | id | value |
      |---|---|---|
      | ["9780596805029",0] | "9780596805029" | "DocBook 5: The Definitive Guide" |
      | ["9780596805029",1] | "44005f2c" | {"_id":"walsh"} |
      | ["9781565920514",0] | "9781565920514" | "Making TeX Work" |
      | ["9781565920514",1] | "44005f72" | {"_id":"walsh"} |
      | ["9781565925809",0] | "9781565925809" | "DocBook: The Definitive Guide" |
      | ["9781565925809",1] | "44006720" | {"_id":"muellner"} |
      | ["9781565925809",1] | "44006b0d" | {"_id":"walsh"} |

  65. Including Docs
      include_docs=true

      | key | id | value | doc (truncated) |
      |---|---|---|---|
      | ["9780596805029",0] | … | … | {"title":"DocBook 5: The Definitive Guide"} |
      | ["9780596805029",1] | … | … | {"name":"Norman Walsh"} |
      | ["9781565920514",0] | … | … | {"title":"Making TeX Work"} |
      | ["9781565920514",1] | … | … | {"name":"Norman Walsh"} |
      | ["9781565925809",0] | … | … | {"title":"DocBook: The Definitive Guide"} |
      | ["9781565925809",1] | … | … | {"name":"Leonard Muellner"} |
      | ["9781565925809",1] | … | … | {"name":"Norman Walsh"} |
  66. Authors and Related Books
  67. Map Function

          function(doc) {
            if ("author" == doc.collection) {
              // Author row, keyed by its own ID
              emit([doc._id, 0], doc.name);
            }
            if ("book-author" == doc.collection) {
              // Relationship row, keyed by author ID and pointing at the book
              emit([doc.author, 1], {"_id":doc.book});
            }
          }

  68. Result Set

      | key | id | value |
      |---|---|---|
      | ["muellner",0] | "muellner" | "Leonard Muellner" |
      | ["muellner",1] | "44006720" | {"_id":"9781565925809"} |
      | ["walsh",0] | "walsh" | "Norman Walsh" |
      | ["walsh",1] | "44005f2c" | {"_id":"9780596805029"} |
      | ["walsh",1] | "44005f72" | {"_id":"9781565920514"} |
      | ["walsh",1] | "44006b0d" | {"_id":"9781565925809"} |

  69. Including Docs
      include_docs=true

      | key | id | value | doc (truncated) |
      |---|---|---|---|
      | ["muellner",0] | … | … | {"name":"Leonard Muellner"} |
      | ["muellner",1] | … | … | {"title":"DocBook: The Definitive Guide"} |
      | ["walsh",0] | … | … | {"name":"Norman Walsh"} |
      | ["walsh",1] | … | … | {"title":"DocBook 5: The Definitive Guide"} |
      | ["walsh",1] | … | … | {"title":"Making TeX Work"} |
      | ["walsh",1] | … | … | {"title":"DocBook: The Definitive Guide"} |
  70. Limitations
      Queries can only contain data from the “left” or “right” side of the relationship (without the use of include_docs)
      Maintaining relationship documents may require more work
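Part of the extra maintenance work is generating one relationship document per pairing. The following sketch, assuming Node.js with `crypto.randomUUID` available, builds those documents for one book and wraps them in a `{"docs": [...]}` body of the kind the bulk document API accepts. `relationshipDocs` is a hypothetical helper, and no request is actually sent.

```javascript
const { randomUUID } = require("crypto");

// Builds one "book-author" relationship document per author of a book.
// makeId is injectable so IDs can be fixed in tests; it defaults to UUIDs.
function relationshipDocs(bookId, authorIds, makeId = randomUUID) {
  return authorIds.map((authorId) => ({
    _id: makeId(),
    collection: "book-author",
    book: bookId,
    author: authorId,
  }));
}

// Request body for a bulk write. The bulk API is not transactional, so some
// of these documents may be written while others fail.
const payload = {
  docs: relationshipDocs("9781565925809", ["muellner", "walsh"]),
};

console.log(JSON.stringify(payload, null, 2));
```

Generated IDs keep relationship documents from colliding, at the cost of making a duplicate pairing possible; checking for an existing pairing before writing is left out of this sketch.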
  71. Final Thoughts
  72. Document Databases Compared to Relational Databases
      Document databases have no tables (and therefore no columns)
      Indexes (views) are queried directly, instead of being used to optimize more generalized queries
      Result set columns can contain a mix of logical data types
      No built-in concept of relationships between documents
      Related entities can be embedded in a document, referenced from a document, or both
  73. Caveats
      No referential integrity
      No atomic transactions across document boundaries
      Some patterns may involve denormalized (i.e. redundant) data
      Data inconsistencies are inevitable (i.e. eventual consistency)
      Consider the implications of replication: what may seem consistent with one database may not be consistent across nodes (e.g. referencing entities that don’t yet exist on the node)
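Because nothing enforces references, an application may want to detect relationship documents that point at entities missing from the local node. The sketch below shows one possible check over an in-memory array of documents; `findDanglingReferences` is a hypothetical helper, and it only inspects the `book-author` documents used in this deck.

```javascript
// Returns one entry for every reference in a "book-author" document that
// does not resolve to a document in the given set.
function findDanglingReferences(docs) {
  const knownIds = new Set(docs.map((doc) => doc._id));
  const dangling = [];
  for (const doc of docs) {
    if (doc.collection !== "book-author") continue;
    for (const field of ["book", "author"]) {
      if (!knownIds.has(doc[field])) {
        dangling.push({ from: doc._id, field, missing: doc[field] });
      }
    }
  }
  return dangling;
}

// The author has replicated to this node, but the book has not yet.
const localDocs = [
  { _id: "walsh", collection: "author", name: "Norman Walsh" },
  { _id: "44005f2c", collection: "book-author", book: "9780596805029", author: "walsh" },
];

console.log(findDanglingReferences(localDocs));
```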
  74. Additional Techniques
      Use the startkey and endkey parameters to retrieve one entity and its related entities:

          startkey=["9781565925809"]&endkey=["9781565925809",{}]

      Define a reduce function and use grouping levels
      Use UUIDs rather than natural keys for better performance
      Use the bulk document API when writing Relationship Documents
      When using the List of Keys or Relationship Documents patterns, denormalize data so that you can have data from the “right” and “left” side of the relationship within your query results
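Two of these techniques can be sketched in a few lines. The first function builds the startkey and endkey query string with the JSON values URL encoded. The second counts rows per leading key element, roughly what a counting reduce with a grouping level of 1 would return; the count includes the row for the “left” entity itself. Both `relatedRange` and `countByFirstKeyElement` are hypothetical helper names, and the rows are copied from slide 49.

```javascript
// Builds the query string that selects one entity and its related entities.
// Keys are JSON encoded first, then URL encoded.
function relatedRange(id) {
  const startkey = encodeURIComponent(JSON.stringify([id]));
  // {} sorts after strings and numbers, so [id, {}] closes the range.
  const endkey = encodeURIComponent(JSON.stringify([id, {}]));
  return `startkey=${startkey}&endkey=${endkey}`;
}

// Counts rows per first key element, as a stand-in for a counting reduce
// queried at grouping level 1.
function countByFirstKeyElement(rows) {
  const counts = new Map();
  for (const row of rows) {
    const group = row.key[0];
    counts.set(group, (counts.get(group) || 0) + 1);
  }
  return counts;
}

const authorRows = [
  { key: ["muellner", 0], id: "muellner", value: "Leonard Muellner" },
  { key: ["muellner", 1], id: "9781565925809", value: "DocBook: The Definitive Guide" },
  { key: ["walsh", 0], id: "walsh", value: "Norman Walsh" },
  { key: ["walsh", 1], id: "9780596805029", value: "DocBook 5: The Definitive Guide" },
  { key: ["walsh", 1], id: "9781565920514", value: "Making TeX Work" },
  { key: ["walsh", 1], id: "9781565925809", value: "DocBook: The Definitive Guide" },
];

const query = relatedRange("9781565925809");
const counts = countByFirstKeyElement(authorRows);

console.log(query);
// Subtract 1 to exclude the author's own row from the book count.
console.log("walsh books:", counts.get("walsh") - 1);
```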
  75. Cheat Sheet

      | | Embedded Entities | Related Documents | List of Keys | Relationship Documents |
      |---|---|---|---|---|
      | One to Many | ✓ | ✓ | | |
      | Many to Many | | | ✓ | ✓ |
      | <= N* Relations | ✓ | | ✓ | |
      | > N* Relations | | ✓ | | ✓ |

      * where N is a large number for your system
  76. http://oreilly.com/catalog/9781449303129/
      http://oreilly.com/catalog/9781449303433/
  77. Thank You
      @BradleyHolt
      http://bradley-holt.com
      bradley.holt@foundline.com
      Copyright © 2011-2012 Bradley Holt. All rights reserved.

Editor's notes

  • A full outer join effectively combines both left and right outer joins. If your relational database doesn’t support full outer joins then a left outer join is “close enough” for the following examples.
  • Entities are joined together in a single row.
  • Entities are collated together, but in separate rows. Note the use of compound keys.
  • Result set may also include a doc column if include_docs is set to true.
  • The “0” and “1” make publisher sort before the publisher’s books. Note the use of compound keys.
  • Note that the keys are the same as with the embedded document approach, but the IDs are different.
  • Note that the best we can do is emit the book IDs, as we don’t have access to any other book data.
  • Note that it includes the doc having the referenced ID, not the doc from which the row was emitted. Note that the docs are truncated.
  • Note that none of the entity documents contain any references to other entities.
  • Note that the docs are truncated.
  • Note that these are trade-offs that provide associated benefits.
  • Note that the startkey and endkey parameters need to be URL encoded. Note that one must account for the “left” entity when using grouping levels. Note that UUIDs are especially useful for Relationship Documents. Note that the bulk document API is not transactional!