3. Agenda
• Some database theory
• Data Modelling in SQL databases
• ACID transactions
• Why NoSQL?
• Data Modelling in NoSQL databases
• CAP theorem
4. Database and its Types
• A database is an organized collection of
data stored and accessed electronically.
Small databases can be stored on a file
system, while large databases are hosted
on computer clusters or cloud storage.
• Types of databases - Relational (SQL
DBs) and Non-Relational (NoSQL DBs)
(Figure: Relational Databases vs. Non-Relational Databases)
6. ACID Transactions
Atomic: Either all operations in a transaction
succeed, or every operation is rolled back.
Consistent: On completion of a transaction, the
database is structurally sound (all constraints hold).
Isolated: Concurrent transactions do not interfere
with each other and appear to run sequentially.
Durable: The result of a committed transaction is
permanent, even in case of a failure.
Because of these ACID properties, relational DBs are
used by applications that require high accuracy and
consistency, e.g. retail and financial applications.
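A minimal sketch of atomicity, using Python's built-in sqlite3 module as a stand-in for any relational database. The accounts table, the names, and the transfer helper are made up for illustration. The credit is applied first on purpose, so that a failing debit shows the earlier change being rolled back.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst; both updates commit or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint if src cannot cover the amount.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO accounts VALUES ('alice', 100), ('bob', 50)")
conn.commit()

ok = transfer(conn, "alice", "bob", 30)       # succeeds
failed = transfer(conn, "alice", "bob", 500)  # debit fails, credit is undone
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
```

After the failed transfer, bob's balance does not include the 500 credit: the whole transaction was rolled back, and the CHECK constraint kept the database consistent.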
7. Data Modeling in Relational Databases
(Figure: Conceptual Data Model, Logical Data Model, Physical Data Model; OLTP systems, EDW, Marts)
10. Why NoSQL?
• Data Format - NoSQL databases support a wide variety
of very large, complex, semi-structured or
unstructured data.
• Performance – RDBMS schemas are usually highly
normalized, so queries often need multiple joins,
which do not perform well on large amounts of
data.
• Scalability - Existing RDBMS solutions mostly scale
up (a bigger server), which has a hard limit and cannot
keep pace with exponential growth of data.
• Availability – NoSQL databases are typically distributed
systems that replicate data across nodes, so they stay
available even when individual nodes fail (for example,
due to a power failure).
• Accommodating - The schema in NoSQL databases is
not fixed or pre-defined; it is designed around the
user access patterns. NoSQL databases can easily
accommodate frequent changes in data structure.
13. MongoDB- Key Concepts
• Data is stored in JSON-like documents
• A MongoDB database contains collections, and each collection
contains documents
• Unlike in an RDBMS, a pre-defined schema for a collection is optional,
which allows flexible data structures.
• Maintains redundant copies of the data through replication
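The database, collection and document hierarchy can be sketched with plain Python dictionaries and lists. This is not the MongoDB driver; the shop database, the customers collection and the find helper are hypothetical names used only to show that documents in one collection need not share the same fields.

```python
import json

# A hypothetical database "shop" holding one collection, "customers".
shop = {
    "customers": [
        {"_id": 1, "name": "Asha", "email": "asha@example.com"},
        # Same collection, different shape: no email, nested address, array.
        {
            "_id": 2,
            "name": "Ben",
            "phones": ["555-0100", "555-0101"],
            "address": {"city": "Pune", "zip": "411001"},
        },
    ]
}

def find(collection, **criteria):
    """Return documents whose top-level fields equal the given criteria."""
    return [
        doc for doc in collection
        if all(doc.get(key) == value for key, value in criteria.items())
    ]

fields_per_doc = [sorted(doc) for doc in shop["customers"]]
ben = find(shop["customers"], name="Ben")[0]
as_json = json.dumps(ben, sort_keys=True)  # documents serialize to JSON
```

Both documents live in the same collection even though their field lists differ, which is what an optional schema allows.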
18. Linking vs. Embedding?
• Embedding stores related data that is frequently accessed
together inside a single document. This is also called a
denormalized data model.
• Linking, also known as referencing, means a document in one
collection stores a reference to a document in another
collection. This is also called a normalized data model.
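A small sketch of the two models using plain Python dictionaries. The order and customer documents and the resolve_customer helper are invented for illustration; a real application would perform these lookups through the database.

```python
# Embedded (denormalized): the customer data lives inside the order,
# so one read returns everything that is accessed together.
order_embedded = {
    "_id": 101,
    "customer": {"name": "Asha", "city": "Pune"},
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}

# Linked (normalized): the order stores only a reference to a document
# in a separate customers collection.
customers = {1: {"_id": 1, "name": "Asha", "city": "Pune"}}
order_linked = {
    "_id": 101,
    "customer_id": 1,
    "items": [{"sku": "A1", "qty": 2}, {"sku": "B7", "qty": 1}],
}

def resolve_customer(order, customers_by_id):
    """The extra lookup that linking requires to follow the reference."""
    return customers_by_id[order["customer_id"]]

# With linking, a change to the customer is made in one place only.
customers[1]["city"] = "Mumbai"
linked_city = resolve_customer(order_linked, customers)["city"]
embedded_city = order_embedded["customer"]["city"]  # stale until rewritten
```

The trade-off shows up in the last lines: embedding saves a lookup on reads, while linking avoids updating every copy when shared data changes.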
19. Data Modeling in DynamoDB
• DynamoDB is a fully managed database service on AWS that can handle complex access patterns, such as time
series data or even geospatial data.
• Key Concepts -
Data is modeled as tables
Data is stored as items (sets of key-value attributes)
Primary Key (a mandatory Partition Key and an optional Sort Key)
Data Types - Scalar (number, string, etc.) and multi-valued (sets)
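The key concepts above can be sketched with a tiny in-memory table. This is not the AWS SDK: TinyTable and the sensor data are hypothetical, and the method names only echo DynamoDB's PutItem, GetItem and Query operations. The sketch assumes a time-series layout with the device as partition key and the timestamp as sort key.

```python
class TinyTable:
    """In-memory stand-in for a table with a partition key and a sort key."""

    def __init__(self, partition_key, sort_key):
        self.partition_key = partition_key
        self.sort_key = sort_key
        self._partitions = {}  # partition value -> {sort value: item}

    def put_item(self, item):
        pk, sk = item[self.partition_key], item[self.sort_key]
        self._partitions.setdefault(pk, {})[sk] = item

    def get_item(self, pk, sk):
        # The full primary key identifies exactly one item.
        return self._partitions.get(pk, {}).get(sk)

    def query(self, pk, sk_prefix=""):
        # Items within one partition come back ordered by sort key.
        partition = self._partitions.get(pk, {})
        return [partition[sk] for sk in sorted(partition) if sk.startswith(sk_prefix)]

readings = TinyTable(partition_key="device_id", sort_key="timestamp")
readings.put_item({"device_id": "sensor-1", "timestamp": "2024-01-02T09:00",
                   "temp": 19, "tags": {"indoor"}})  # "tags" is a set attribute
readings.put_item({"device_id": "sensor-1", "timestamp": "2024-01-01T10:00",
                   "temp": 21, "tags": {"indoor"}})
readings.put_item({"device_id": "sensor-2", "timestamp": "2024-01-01T10:00",
                   "temp": 30, "tags": {"outdoor"}})

jan_first = readings.query("sensor-1", sk_prefix="2024-01-01")
all_sensor_1 = [item["temp"] for item in readings.query("sensor-1")]
```

Choosing the timestamp as sort key is what makes "all readings for one device in a time range" a single query against one partition.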
21. CAP Theorem
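• Consistency - All users see the same
data at the same time.
• Availability – The system responds
to every incoming request, whether the
operation succeeded or failed.
• Partition Tolerance – The system
continues to function even when a network
failure cuts off some nodes from the rest.
• A distributed system can guarantee at most
two of these three properties at once; during
a partition it must give up either consistency
or availability.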
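The trade-off can be illustrated with a toy two-replica cluster. The Replica and Cluster classes are hypothetical and heavily simplified: a "CP" cluster refuses writes while partitioned, whereas an "AP" cluster accepts them locally and lets the replicas diverge.

```python
class Replica:
    def __init__(self):
        self.value = None

class Cluster:
    """Toy cluster of two replicas that prefers either C or A under partition."""

    def __init__(self, mode):
        if mode not in ("CP", "AP"):
            raise ValueError("mode must be 'CP' or 'AP'")
        self.mode = mode
        self.a, self.b = Replica(), Replica()
        self.partitioned = False

    def write(self, replica, value):
        peer = self.b if replica is self.a else self.a
        if not self.partitioned:
            replica.value = value
            peer.value = value       # replication succeeds
            return True
        if self.mode == "CP":
            return False             # stay consistent, give up availability
        replica.value = value        # stay available, give up consistency
        return True

cp, ap = Cluster("CP"), Cluster("AP")
for cluster in (cp, ap):
    cluster.write(cluster.a, "v1")   # healthy network: both replicas get v1
    cluster.partitioned = True       # the link between a and b goes down

cp_accepted = cp.write(cp.a, "v2")
ap_accepted = ap.write(ap.a, "v2")
```

After the partition, the CP cluster rejects the write but both replicas still agree on "v1"; the AP cluster accepts it, so replica a holds "v2" while replica b still holds "v1".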
23. Summary
• SQL - works great, but does not scale easily for very large data
• NoSQL - works great, but isn’t suitable for every use case
• SQL + NoSQL - optimized solution