
No SQL, No Problem: Use Azure DocumentDB

697 views


Introduction to Microsoft Azure DocumentDB. The slides have sections on Overview, Resource Model, Data Modeling, Performance, Development, Pricing and DocumentDB resources.

This talk was given at the following locales:
- DevTeach Montreal (July 6, 2016)

Published in: Technology


  1. NoSQL, NO PROBLEM: USING AZURE DOCUMENTDB
     { "name": "Ken Cenerelli", "twitter": "@KenCenerelli", "e-mail": "Ken_Cenerelli@Outlook.com", "hashtags": ["#DevTeach", "#DocumentDB"] }
  2. ABOUT ME
     Ken Cenerelli
     - Twitter: @KenCenerelli
     - Email: Ken_Cenerelli@Outlook.com
     - Blog: kencenerelli.wordpress.com
     - LinkedIn: linkedin.com/in/kencenerelli
     Bio:
     - Content Developer / Programmer Writer
     - Microsoft MVP - Visual Studio and Development Technologies
     - Microsoft TechNet Wiki Guru
     - Co-Organizer of CTTDNUG
     - Technical reviewer of multiple books
  3. ROAD MAP
     1. Overview
     2. The Resource Model
     3. Modeling Your Data
     4. Performance
     5. Developing with DocumentDB & Demos {"aka": "The Good Stuff"}
     6. Pricing
     7. Wrap-up
  4. WHAT IS NoSQL?
     - NoSQL → Not Only SQL
     - No up-front (schema) design
     - Easier to scale horizontally
     - Easier to develop iteratively
     - Types & Examples:
       - Document databases: DocumentDB, MongoDB, CouchDB
       - Key-value stores: Redis
       - Graph stores: Neo4j, Giraph
       - Wide-column: Cassandra, HBase
  5. WHAT IS AZURE DOCUMENTDB?
     - NoSQL document database fully managed by Microsoft Azure
     - Part of the NoSQL family of databases
     - For rapid development of cloud-designed apps (web, mobile, gaming, IoT)
     - Store and query schema-agnostic JSON data with SQL-like grammar
     - Fast, predictable performance
     - Transactionally process multiple documents via native JavaScript processing
     - Tunable consistency levels
     - Built with familiar tools: REST, JSON, JavaScript
  6. [Image-only slide]
  7. WHERE DOES IT FIT IN THE AZURE FAMILY?
  8. WHEN TO USE DOCUMENTDB?
     - In general
       - You don’t want to do replication and scale-out by yourself
       - You want ACID transactions
       - You want to have tunable consistency
       - You want to do rapid development where models can evolve
       - You want to utilize your .NET, JavaScript and MongoDB skills
     - Compared to relational databases
       - You don’t want predefined columns
     - Compared to other document stores
       - You want to use a SQL-like grammar
  9. WHEN NOT TO USE DOCUMENTDB?
     - If your data has complex relationships
     - If your data has rigid schemas
     - If your data has complex transactions
     - If your data needs aggregation
     - If your data needs encrypted storage
     - If you’re planning to move your entire data store to DocumentDB
     - If you do not want your data to be locked into Azure
  10. DOCUMENTDB USE CASES
      - User generated content
        - Blog posts, chat sessions, ratings, comments, feedback, polls
      - Catalog data
        - User accounts, product catalogs, device registries for IoT
      - Logging and time-series data
        - Event logs, input source for data analytics jobs performed offline
      - Gaming
        - In-game stats, social media integration, and high-score leaderboards
      - User preferences data
        - Modern web and mobile applications
      - IoT and device sensor data
        - Ingest bursts of data from device sensors, ad-hoc querying and offline analytics
  11. RESOURCE MODEL [diagram]
  12. RESOURCE MODEL [diagram]
  13. RESOURCE MODEL [diagram]
      - A collection != a table of homogeneous entities
      - A collection ~ a data partition
  14. RESOURCE MODEL [diagram]
      { "id": "123", "name": "joe", "age": 30, "address": { "street": "some st" } }
  15. RESOURCE MODEL [diagram]
  16. RESOURCE ADDRESSING
      - Native REST interface
      - Each resource has a permanent unique ID
      - API URL:
        - https://{database account}.documents.azure.com
      - Document path:
        - /dbs/{database id}/colls/{collection id}/docs/{document id}
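
The addressing scheme on this slide can be sketched as a small helper that assembles a document URL from its parts. The function name `document_url` and the sample account, database, collection and document names are hypothetical; only the host pattern and the path pattern come from the slide.

```python
from urllib.parse import quote


def document_url(account: str, database_id: str, collection_id: str, document_id: str) -> str:
    """Build the REST URL for one document, following the slide's pattern."""
    # Host pattern from the slide: https://{database account}.documents.azure.com
    host = f"https://{account}.documents.azure.com"
    # Path pattern from the slide:
    # /dbs/{database id}/colls/{collection id}/docs/{document id}
    # quote(..., safe="") percent-encodes characters such as "/" or spaces in an id.
    path = "/dbs/{}/colls/{}/docs/{}".format(
        quote(database_id, safe=""),
        quote(collection_id, safe=""),
        quote(document_id, safe=""),
    )
    return host + path


# Hypothetical names, used only to show the shape of the result.
url = document_url("myaccount", "people", "contacts", "123")
print(url)
```

A real request would also need authorization headers, which the slide does not describe and this sketch leaves out.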
  17. DOCUMENTDB JSON DOCUMENTS
      JSON
      - Intersection of most modern type systems
      JSON values
      - Self-describable, self-contained values
      - Are trivially serialized to/from text

          {
            "locations": [
              {"country": "Germany", "city": "Berlin"},
              {"country": "France", "city": "Paris"}
            ],
            "headquarters": "Belgium",
            "exports": [{"city": "Moscow"}, {"city": "Athens"}]
          }

      [Diagram: the same JSON document drawn as a tree, with locations, headquarters and exports as branches and array positions 0 and 1 as child nodes]
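
The "JSON document as a tree" idea can be sketched by parsing the slide's sample document and listing every leaf with its path. The sample below has the slide's punctuation slips corrected so that it parses; the helper name `leaf_paths` and the slash-separated path notation are illustrative choices, not DocumentDB syntax.

```python
import json

# The slide's sample document, corrected so that it is valid JSON.
raw = """
{
  "locations": [
    {"country": "Germany", "city": "Berlin"},
    {"country": "France", "city": "Paris"}
  ],
  "headquarters": "Belgium",
  "exports": [{"city": "Moscow"}, {"city": "Athens"}]
}
"""


def leaf_paths(node, prefix=""):
    """Yield (path, value) pairs for every leaf of a parsed JSON value."""
    if isinstance(node, dict):
        for key, child in node.items():
            yield from leaf_paths(child, f"{prefix}/{key}")
    elif isinstance(node, list):
        # Array elements become child nodes labelled by position (0, 1, ...).
        for index, child in enumerate(node):
            yield from leaf_paths(child, f"{prefix}/{index}")
    else:
        yield prefix, node


document = json.loads(raw)
paths = list(leaf_paths(document))
for path, value in paths:
    print(path, "=", value)
```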
  18. DATA MODELING WITH RDBMS
      Doing it the RDBMS way: normalize everything!
      Querying for a Person requires joins to the related tables:

          SELECT p.name, p.lastName, p.age, cd.detail, cdt.type,
                 a.street, a.city, a.state, a.zip
          FROM Person p
          INNER JOIN Address a ON a.person_id = p.id
          INNER JOIN ContactDetail cd ON cd.person_id = p.id
          INNER JOIN ContactDetailType cdt ON cd.type_id = cdt.id

      Updating a Person likewise means updating multiple tables.
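
To see the normalized approach run end to end, here is a minimal sketch that uses Python's built-in `sqlite3` module with an in-memory database. The table definitions and sample rows are assumptions inferred from the column names in the slide's query (the age value in particular is made up), and the `ORDER BY` clause is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema inferred from the columns the slide's query selects.
conn.executescript("""
CREATE TABLE Person (id INTEGER PRIMARY KEY, name TEXT, lastName TEXT, age INTEGER);
CREATE TABLE Address (id INTEGER PRIMARY KEY, person_id INTEGER,
                      street TEXT, city TEXT, state TEXT, zip TEXT);
CREATE TABLE ContactDetailType (id INTEGER PRIMARY KEY, type TEXT);
CREATE TABLE ContactDetail (id INTEGER PRIMARY KEY, person_id INTEGER,
                            type_id INTEGER, detail TEXT);

INSERT INTO Person VALUES (1, 'Thomas', 'Andersen', 30);
INSERT INTO Address VALUES (1, 1, '100 Some Street', 'Seattle', 'WA', '98012');
INSERT INTO ContactDetailType VALUES (1, 'email'), (2, 'phone');
INSERT INTO ContactDetail VALUES (1, 1, 1, 'thomas@andersen.com'),
                                 (2, 1, 2, '+1 555 555-5555');
""")

# The slide's query: one person is spread over four tables and three joins.
rows = conn.execute("""
SELECT p.name, p.lastName, p.age, cd.detail, cdt.type,
       a.street, a.city, a.state, a.zip
FROM Person p
INNER JOIN Address a ON a.person_id = p.id
INNER JOIN ContactDetail cd ON cd.person_id = p.id
INNER JOIN ContactDetailType cdt ON cd.type_id = cdt.id
ORDER BY cdt.type
""").fetchall()

for row in rows:
    print(row)
conn.close()
```

The query returns one row per contact detail, so the person and address columns repeat on every row.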
  19. DATA MODELING WITH DENORMALIZATION

          {
            "id": "1",
            "firstName": "Thomas",
            "lastName": "Andersen",
            "addresses": [
              {
                "line1": "100 Some Street",
                "line2": "Unit 1",
                "city": "Seattle",
                "state": "WA",
                "zip": 98012
              }
            ],
            "contactDetails": [
              {"email": "thomas@andersen.com"},
              {"phone": "+1 555 555-5555", "extension": 5555}
            ]
          }

      Try to model your entity as a self-contained document.
      Generally, use embedded data models when:
      - The relationship is "contains"
      - The relationship is one-to-few
      - The embedded data changes infrequently
      - The embedded data won’t grow without bound
      - The embedded data is integral to the document
      Embedding typically provides better read performance.
  20. DATA MODELING WITH REFERENCING
      In general, use normalized data models when:
      - Write performance is more important than read performance
      - Representing one-to-many relationships
      - Representing many-to-many relationships
      - Related data changes frequently
      Provides more flexibility than embedding, at the cost of more round trips to read data.

          { "id": "xyz", "username": "user xyz" }

          { "id": "address_xyz", "userid": "xyz", "address": { … } }

          { "id": "contact_xyz", "userid": "xyz", "email": "user@user.com", "phone": "555 5555" }

      Normalizing typically provides better write performance.
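
The "more round trips" cost of referencing can be sketched with an in-memory stand-in for a collection. Nothing here calls DocumentDB: the dict, the `read_document` helper and the read counter are hypothetical, and the address content (elided on the slide) is filled with a placeholder city purely for illustration.

```python
# A dict keyed by document id stands in for a collection (illustration only).
collection = {
    "xyz": {"id": "xyz", "username": "user xyz"},
    # The slide elides the address body; the city below is a placeholder.
    "address_xyz": {"id": "address_xyz", "userid": "xyz", "address": {"city": "Seattle"}},
    "contact_xyz": {"id": "contact_xyz", "userid": "xyz",
                    "email": "user@user.com", "phone": "555 5555"},
}

reads = 0


def read_document(doc_id):
    """Fetch one document and count the simulated round trip."""
    global reads
    reads += 1
    return collection[doc_id]


def load_user_profile(user_id):
    """Assemble a full profile from three referenced documents."""
    user = read_document(user_id)
    address = read_document(f"address_{user_id}")
    contact = read_document(f"contact_{user_id}")
    return {
        "username": user["username"],
        "address": address["address"],
        "email": contact["email"],
        "phone": contact["phone"],
    }


profile = load_user_profile("xyz")
print(profile["username"], "assembled in", reads, "reads")
```

With an embedded model the same profile would come back in a single read, which is the read-versus-write trade-off the slide describes.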
  21. HYBRID MODELS: DENORMALIZE + REFERENCE
      No magic bullet! Think about how your data is going to be written and read, then model accordingly.

      Author document:

          {
            "id": "1",
            "firstName": "Thomas",
            "lastName": "Andersen",
            "countOfBooks": 3,
            "books": [1, 2, 3],
            "images": [
              {"thumbnail": "http://....png"},
              {"profile": "http://....png"}
            ]
          }

      Book document:

          {
            "id": 1,
            "name": "DocumentDB 101",
            "authors": [
              {"id": 1, "name": "Thomas Andersen", "thumbnail": "http://....png"},
              {"id": 2, "name": "William Wakefield", "thumbnail": "http://....png"}
            ]
          }
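
A hybrid model means some values are stored twice, so every write has to keep the copies in step. The sketch below shows that bookkeeping on plain dictionaries; the function `add_book` and the title "DocumentDB 201" are hypothetical, and no database call is involved.

```python
author = {
    "id": "1",
    "firstName": "Thomas",
    "lastName": "Andersen",
    "countOfBooks": 3,
    "books": [1, 2, 3],
}


def add_book(author_doc, book_id, title):
    """Return (updated author document, new book document) for a newly written book."""
    updated_author = dict(author_doc)
    # The author document references its books by id.
    updated_author["books"] = author_doc["books"] + [book_id]
    # countOfBooks is a denormalized copy, so it is recomputed on every write.
    updated_author["countOfBooks"] = len(updated_author["books"])
    book_doc = {
        "id": book_id,
        "name": title,
        # A small copy of author data is embedded so a book page needs one read.
        "authors": [{
            "id": int(author_doc["id"]),
            "name": f'{author_doc["firstName"]} {author_doc["lastName"]}',
        }],
    }
    return updated_author, book_doc


updated, book = add_book(author, 4, "DocumentDB 201")
print(updated["countOfBooks"], book["authors"][0]["name"])
```

The price of the faster reads is visible here: one logical change writes two documents.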
  22. DATA MODELING TIPS
      - Map properties to JSON types
      - Prefer smaller documents (<16KB) for smaller footprint, less IO, lower RU charges
      - Maximum size is 512KB; watch for unbounded arrays leading to document bloat
      - Store metadata on attachments, reference binary data/free text as external links
      - Prefer sparse properties: skip a property rather than storing an explicit null
      - Use fullName = "Azure DocumentDB" instead of firstName = "Azure" AND lastName = "DocumentDB"
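
Two of these tips, sparse properties and the size thresholds, lend themselves to a quick pre-write check. This is a minimal sketch: the helper names are hypothetical, and measuring size as compact UTF-8 JSON is an assumption, since the slides do not say how the service counts bytes.

```python
import json

PREFERRED_BYTES = 16 * 1024   # "prefer smaller documents (<16KB)"
MAXIMUM_BYTES = 512 * 1024    # "maximum size is 512KB"


def drop_nulls(value):
    """Recursively remove properties whose value is None (sparse properties)."""
    if isinstance(value, dict):
        return {k: drop_nulls(v) for k, v in value.items() if v is not None}
    if isinstance(value, list):
        return [drop_nulls(v) for v in value]
    return value


def size_report(document):
    """Return the serialized size in bytes and which slide threshold it meets."""
    # Assumption: size is approximated by the compact UTF-8 JSON encoding.
    size = len(json.dumps(document, separators=(",", ":")).encode("utf-8"))
    if size > MAXIMUM_BYTES:
        verdict = "too large"
    elif size >= PREFERRED_BYTES:
        verdict = "allowed but larger than preferred"
    else:
        verdict = "preferred"
    return size, verdict


doc = {
    "id": "1",
    "fullName": "Azure DocumentDB",
    "nickname": None,
    "address": {"line2": None, "city": "Seattle"},
}
sparse = drop_nulls(doc)
print(sparse, size_report(sparse))
```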
  23. TUNABLE CONSISTENCY
      - Set at the account level
      - Can be overridden at the query level
      - Levels, from strongest (slowest writes) to weakest (fastest writes):
        - Strong
        - Bounded Staleness
        - Session (default option)
        - Eventual
  24. INDEXING
      - Automatic indexing of documents and their properties when added to the collection
      - Instantly queryable by property using a SQL-like grammar
      - No need to define secondary indices / schema hints for indexing
      Indexing modes
      - Consistent
        - Default mode
        - Index updated synchronously on writes
      - Lazy
        - Useful for bulk ingestion scenarios
      Indexing policies
      - Automatic
        - Default
      - Manual
        - Can manually opt out of automatic indexing
      Indexing types
      - Hash
        - For equality queries
        - Strings and numbers
      - Range
        - For comparison queries
        - Numbers
  25. INDEXING POLICIES

      Configuration               | Level          | Options                                          | Notes
      Automatic                   | Per collection | True (default) or False                          | Override with each document write
      Indexing Mode               | Per collection | Consistent or Lazy                               | Lazy for eventual updates/bulk ingestion
      Included and excluded paths | Per path       | Individual path or recursive includes (? and *)  |
      Indexing Type               | Per path       | Supports Hash (default) and Range                | Hash for equality, Range for range queries
      Indexing Precision          | Per path       | Supports 3 to 7 per path                         | Trade-off between storage, query RUs and write RUs
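
The options in this table can be captured as a small data structure with a sanity check. The dictionary layout and key names below (`automatic`, `mode`, `paths`, `type`, `precision`) are illustrative assumptions that mirror the table's rows; they are not the policy format the service itself accepts, which should be taken from the official documentation.

```python
ALLOWED_MODES = {"consistent", "lazy"}
ALLOWED_INDEX_TYPES = {"hash", "range"}


def validate_policy(policy):
    """Return a list of problems; an empty list means the policy fits the table."""
    problems = []
    if not isinstance(policy.get("automatic", True), bool):
        problems.append("automatic must be True or False")
    if policy.get("mode", "consistent") not in ALLOWED_MODES:
        problems.append("mode must be consistent or lazy")
    for entry in policy.get("paths", []):
        path = entry.get("path", "<no path>")
        if entry.get("type", "hash") not in ALLOWED_INDEX_TYPES:
            problems.append(f"{path}: type must be hash or range")
        precision = entry.get("precision")
        if precision is not None and not 3 <= precision <= 7:
            problems.append(f"{path}: precision must be between 3 and 7")
    return problems


# Hypothetical policy for a bulk-load collection.
bulk_load_policy = {
    "automatic": True,
    "mode": "lazy",  # lazy suits bulk ingestion, per the table
    "paths": [
        {"path": "/name/?", "type": "hash", "precision": 3},   # equality lookups
        {"path": "/age/?", "type": "range", "precision": 7},   # range comparisons
    ],
}
print(validate_policy(bulk_load_policy))
```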
  26. DOCUMENTDB FOR DEVELOPERS
      - Promotes code-first development
      - Resilient to iterative schema changes
      - Low impedance as object / JSON store; no ORM required
      - Richer query and indexing
      - Has a REST API
      - Available SDKs and libraries:
        - .NET (LINQ to SQL is supported)
        - Node.js
        - JavaScript
        - Python
        - Java
      - JavaScript for server-side app logic
  27. QUERYING LIMITATIONS
      - Queries run within a single collection
      - Besides filtering, ORDER BY and TOP are supported
      - No aggregation yet
        - No COUNT
        - No GROUP BY
        - No SUM, AVG, etc.
      - SQL is for queries only
        - No batch UPDATE, DELETE or CREATE
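
Given the aggregation gap described on this slide, one workaround is to aggregate on the client after the query returns. The sketch below assumes the rows have already been fetched; the sample data is hypothetical and no SDK call is shown.

```python
from collections import defaultdict

# Stand-in for rows returned by a filtered query (hypothetical data).
rows = [
    {"city": "Seattle", "age": 30},
    {"city": "Seattle", "age": 40},
    {"city": "Montreal", "age": 50},
]

count = len(rows)                                    # COUNT(*)
total_age = sum(row["age"] for row in rows)          # SUM(age)
average_age = total_age / count if count else None   # AVG(age)

# GROUP BY city with COUNT(*)
per_city = defaultdict(int)
for row in rows:
    per_city[row["city"]] += 1

print(count, total_age, average_age, dict(per_city))
```

Client-side aggregation pulls every matching document over the wire, so it only suits result sets that stay small.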
  28. DEMO TIME!
  29. REQUEST UNITS
      - DocumentDB unit of scale
      - Throughput (in terms of rate of transactions / second)
      - Measured in Request Units (RUs)
      - 1 RU = throughput for a 1KB document/second
      - 2,000 requests per second allowed
      - "Request" depends on the size of the document
        - For example, uploading 1,000 large JSON documents might count as more than one request
      - Max throughput per collection, measured in RUs per second, is 250,000 RUs/second
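
As a back-of-the-envelope illustration of these figures, the sketch below estimates read throughput from document size. The linear model (cost grows with size in whole kilobytes) is a deliberate simplification and an assumption, not the service's actual charging formula; only the 1 RU per 1KB baseline and the 250,000 RU/s ceiling come from the slide.

```python
import math

MAX_RUS_PER_COLLECTION = 250_000  # per-collection ceiling quoted on the slide


def estimated_read_rus(document_kb, reads_per_second):
    """Rough RU/s estimate, ASSUMING cost grows linearly with size in whole KB."""
    return math.ceil(document_kb) * reads_per_second


# Hypothetical workload: 4KB documents read 500 times per second.
needed = estimated_read_rus(document_kb=4, reads_per_second=500)
print(needed, needed <= MAX_RUS_PER_COLLECTION)
```

Writes, queries and stored procedures cost different amounts, so a measured charge from real requests is a better planning input than any formula like this one.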
  30. REQUEST UNITS
      - A Request Unit (RU) is the normalized currency for % memory, % IOPS and % CPU
      - Each replica gets a fixed budget of Request Units
      - Result: predictable performance
      [Diagram: operations on resources (documents, SQL queries, stored procedures with arguments) drawing on the replica's RU budget]
  31. NOT ALL REQUEST UNITS ARE CREATED EQUALLY
  32. PRICING
      - Standard pricing tier with hourly billing
      - 99.95% availability
      - Adjustable performance levels
      - Collections have 10 GB SSD room
      - Limit of 100 collections (1 TB) for each account (can be adjusted)
      - http://bit.do/documentdb-pricing
  33. LIMITATIONS & QUOTA

      Entity                                           | Quota
      Accounts                                         | 5 (soft)
      DBs / Account                                    | 100
      Document storage per collection                  | 250 GB
      Collections / DB                                 | 100 (soft)
      Request document size                            | 512 KB
      Permissions / Account                            | 2M
      Stored Procedures, Triggers & UDFs / collection  | 25
      Max Execution Time / Stored Procedure or Trigger | 5 seconds
      ID Length                                        | 255 chars
      AND, OR / query                                  | 20

      Source: https://azure.microsoft.com/en-us/documentation/articles/documentdb-limits/
  34. SUMMARY
      - Collections != Tables
      - De-normalize data where appropriate
      - Tuning / Performance
        - Consistency Levels
        - Indexing Policies
      - Understand Query Costs / Limits / Avoid Scans
  35. DESIGNING A DOCUMENTDB APP [six numbered steps; step text not captured in the transcript]
  36. RESOURCES
      - Query Playground: aka.ms/docdbplayground
      - Data Import Tool: aka.ms/docdbimport
      - Docs & Tutorials: aka.ms/documentdb-docs
      - Code Samples: aka.ms/documentdb-samples
      - Cheat Sheet: aka.ms/docdbcheatsheet
      - Blog: aka.ms/documentdb-blog
      - Twitter: @documentdb
  37. QUESTIONS?
      - @KenCenerelli
      - Ken_Cenerelli@Outlook.com
      Please complete the session evaluation to win prizes!
      CLD101: NoSQL, No Problem: Use Azure DocumentDB
  38. Credit:
