Ce diaporama a bien été signalé.
Le téléchargement de votre SlideShare est en cours. ×

Relational and non relational database 7

Chargement dans…3

Consultez-les par la suite

1 sur 40 Publicité

Plus De Contenu Connexe

Diaporamas pour vous (20)

Similaire à Relational and non relational database 7 (20)


Plus récents (20)

Relational and non relational database 7

  1. 1. Relational and Non- Relational Database Dr. Zainab Namh Lect. 9 Relational vs Non – Relational Database
  2. 2. Introduction to NoSQL? 2
  3. 3. NoSQL • NoSQL is a non-relational DMS, that does not require a fixed schema, avoids joins, and is easy to scale. • NoSQL database is used for distributed data stores with humongous data storage needs. • NoSQL is used for Big data and real-time web apps. For example, companies like Twitter, Facebook, Google that collect terabytes of user data every single day. • NoSQL database stands for "Not Only SQL" or "Not SQL“ Though a better term would NoREL. • Carl Strozz introduced the NoSQL concept in 1998. 3
  4. 4. What is Big Data? • A term for data sets that are so large that traditional methods storage and processing are inadequate. • Massive increase in data volume within the last decade or so • Social networks, search engines, etc. • Challenges in storage, capture, analysis, transfer, etc. 4
  5. 5. Advantages and Limitations of RDBMS • Better for relational data • Normalization • Well known standard language (SQL) • ACID Compliance However • No flexibility in choosing the right DB for the right application • Expensive if runs across a cluster • Integrity is more important than availability and low latency 5
  6. 6. Why NoSQL? • The concept of NoSQL databases became popular with Internet giants like Google, Facebook, Amazon, etc. who deal with huge volumes of data. The system response time becomes slow when you use RDBMS for massive volumes of data. • To resolve this problem, we could "scale up" our systems by upgrading our existing hardware. This process is expensive. • The alternative for this issue is to distribute database load on multiple hosts whenever the load increases. This method is known as "scaling out." 6
  7. 7. Horizontal vs Vertical scaling 7
  8. 8. Features of NoSQL • Non-relational • NoSQL databases never follow the relational model • Never provide tables with flat fixed-column records • Work with self-contained aggregates • Doesn't require data normalization • No complex features like query languages, referential integrity joins and ACID • Schema-free • NoSQL databases are either schema-free or have relaxed schemas • Do not require any sort of definition of the schema of the data • Offers heterogeneous structures of data in the same domain 8
  9. 9. NoSQL – Data Structure 9
  10. 10. Features of NoSQL • Distributed • Multiple NoSQL databases can be executed in a distributed fashion • Offers auto-scaling and fail-over capabilities • Often ACID concept can be sacrificed for scalability and throughput 10
  11. 11. What is the CAP Theorem? • CAP theorem is also called brewer's theorem. It states that is impossible for a distributed data store to offer more than two out of three guarantees: • Consistency • Availability • Partition Tolerance • As soon as you partition a DB the data might either: • Become inconsistent – one replication holds different data from another • Become unavailable – to keep consistency, we must propagate updates, which takes time 11
  12. 12. What is the CAP Theorem? • Consistency: • The data should remain consistent even after the execution of an operation. • This means once data is written, any future read request should contain that data. • For example, after updating the order status, all the clients should be able to see the same data. 12
  13. 13. What is the CAP Theorem? • Availability: • The database should always be available and responsive. It should not have any downtime. • Partition Tolerance: • Partition Tolerance means that the system should continue to function even if the communication among the servers is not stable. • For example, the servers can be partitioned into multiple groups which may not communicate with each other. • Here, if part of the database is unavailable, other parts are always unaffected. 13
  14. 14. Eventual Consistency - BASE • BASE: Basically Available, Soft state, Eventual consistency • Basically available means DB is available all the time as per CAP theorem • Soft state means even without an input; the system state may change • Eventual consistency means that the system will become consistent over time 14
  15. 15. Eventual Consistency Example • If you pay some money into your bank account, you don’t want some cash machines to know its there and others not to • But if you post something on Facebook, you might put up with the fact that a friend in the UK can see it for a moment while an American friend, looking at a slightly older replication cannot. He’ll see it eventually 15
  16. 16. ACID vs BASE 16
  17. 17. NoSQL database types • There are mainly four categories of NoSQL databases. Each of these categories has its unique attributes and limitations: 1. Key-value Pair Based 2. Column-oriented 3. Graphs based 4. Document-oriented • No specific database is better to solve all problems. You should select a database based on your product needs. 17
  18. 18. Key Value Store NoSQL Database • Data is stored in key/value pairs. It is designed in such a way to handle lots of data and heavy load. • Key-value pair storage databases store data as a hash table where each key is unique, and the value can be a JSON (JavaScript Object Notation), BLOB(Binary Large Objects), string, etc. • For example, a key-value pair may contain a key like "Website" associated with a value like "Guru99". • When we try and reflect back on the CAP theorem, it becomes quite clear that key value stores are great around the Availability and Partition aspects but definitely lack in Consistency. 18
  19. 19. Example 19
  20. 20. Column-based NoSQL Database • Column stores in NoSQL are similar at first appearance to traditional relational DBMS. The concepts of rows and columns are still there. You also define column families before loading data into the database, meaning that the structure of data must be known in advance. • However, column stores organize data differently than relational databases do. Instead of storing data in a row for fast access, data is organized for fast column operations. • They deliver high performance on aggregation queries like SUM, COUNT, AVG, MIN etc. as the data is readily available in a column. 20
  21. 21. Column-based NoSQL Database 21
  22. 22. Document Store NoSQL Database • Document-Oriented NoSQL DB stores and retrieves data as a key value pair but the value part is stored as a document. The document is stored in JSON or XML formats. 22
  23. 23. NoSQL - Example 23
  24. 24. NoSQL - Example 24
  25. 25. Graph-Based NoSQL Database • A graph type database stores entities as well the relations amongst those entities. The entity is stored as a node with the relationship as edges. An edge gives a relationship between nodes. Every node and edge has a unique identifier. 25
  26. 26. Disadvantages of NoSQL • No standardization rules • Limited query capabilities • RDBMS databases and tools are comparatively mature • It does not offer any traditional database capabilities, like consistency when multiple transactions are performed simultaneously. • When the volume of data increases it is difficult to maintain unique values as keys become difficult • Doesn't work as well with relational data 26
  27. 27. NoSQL Example – Review DB Product: iPhone 5 Price: £500 Camera: Fine Screen: Very good Accessories: Headphone, Case, .... Product: iPhone 5 Price: £500 Camera: Excellent Screen: Poor in sunshine Operating system: Easy to use 27
  28. 28. Example Review in JSON { Id:847543, Name:iPhone5, Features:[GPS,Retina Display,Siri], Reviews:[ {Reviewer:458743,Date:12.4.1013,Speed:Slow}, {Reviewer:636534,Date:2.5.1013,Camera:Great}, ] } 28
  29. 29. Possible Review DB Design • For our product review database, potential units to aggregate are: – Users – keep all a person’s reviews together – Products – keep all the reviews of a product together 29
  30. 30. Review Queries? • What are Common Queries? • Which design you choose depends on a few things, for example, what is the most common query? – List all reviews from a given customer – List all reviews of a given product – Find all products that are ‘fast’ 30
  31. 31. Transaction Concurrency ACID Recovery 31
  32. 32. ACID- Atomic, Consistent, Isolated, Durable • Atomicity: A transaction is an atomic unit of processing; it is either performed in its entirety or not performed at all. • Consistency preservation: A transaction is consistency preserving if its complete execution take(s) the database from one consistent state to another. • Isolation: A transaction should appear as though it is being executed in isolation from other transactions. That is, the execution of a transaction should not be interfered with by any other transactions executing concurrently. • Durability or permanency: The changes applied to the database by a committed transaction must persist in the database. These changes must not be lost because of any failure. 32
  33. 33. What is a Transaction? • A transaction is a logical unit of database processing (or a logical unit of work on the database) that includes one or more database access operations: – insertion, – deletion, – modification, or – retrieval operations. 33
  34. 34. What is a Transaction? • If the database operations in a transaction do not update the database but only retrieve data, the transaction is called a read-only transaction. • Read-only transaction - do not changes the state of a database, only retrieves data. • A database system interprets a transaction not as an application program but a logical sequence of read and write operations. 34
  35. 35. Concurrent and Interleaving • Multiprogramming operating systems execute some commands from one process, then suspend this process and execute some commands from the next process, and so on. This type of process execution is called interleaving. • After a while, the execution of the first process is resumed at the point where it was suspended. • Interleaving keeps the CPU busy when a process requires an input or output operation. The CPU is switched to execute another process rather than remaining idle during I/O time. • Interleaving also prevents a long process from delaying other processes. 35
  36. 36. Why Interleave? 36
  37. 37. Why concurrency control is needed? 37
  38. 38. Why Recovery Is Needed? • Whenever a transaction is submitted to a DBMS for execution, the system is responsible for making sure that either (1) all the operations in the transaction are completed successfully and their effect is recorded permanently in the database, or (2) the transaction has no effect whatsoever on the database or on any other transactions. • For the purpose of recovery and concurrency control, we are mainly interested in the read_item and write_item operations of the transactions, as well as the commit and abort operations. 38
  39. 39. Successful Schedule 39
  40. 40. References • https://www.guru99.com/nosql-tutorial.html • Introduction to NoSQL – Kevin Swingler – 2017. • Advanced Database Lectures – Prof. Musbah Aqel – 2011. 40