4. Relational DB
• Introduced in the 1970s
• SQL, relational algebra & set theory
• Excellent for management applications
(accounting, reservations, staff management)
5. ACID
Transactions behave correctly only if the
database satisfies these four properties:
• Atomicity
• Consistency
• Isolation
• Durability
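As a minimal sketch of atomicity, the example below uses Python's standard `sqlite3` module with an in-memory database. The `accounts` table, the `transfer` helper and the account names are assumptions made up for illustration; they are not part of the slides.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move `amount` from `src` to `dst`; either both updates apply or neither."""
    try:
        # Used as a context manager, the connection commits on success
        # and rolls back if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def balances(conn):
    """Return the current balances as a dict of name -> balance."""
    return dict(conn.execute("SELECT name, balance FROM accounts"))


# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
)
conn.commit()
```

A failed transfer leaves both balances untouched, because the debit that was already executed is rolled back together with the rest of the transaction.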
11. Historical Intro
The concept of a “non-relational database” is
older than the “relational model”, but it has been
revived and improved:
the technology is coming back
15. New Requirements
Mid-1990s:
with the new internet-based systems,
consistency and security of data are no
longer enough;
the new need is high availability
16. Google
• distributed storage system
• scales data size up to petabytes
Wide applicability
Scalability
High performance
High availability
17. Google BigTable
Column-oriented DB
• Web indexing
• Google Earth
• Google Finance
• Orkut
• Custom Search
• Google Docs
18. Amazon
• Relational model doesn’t fit requirements
• Tens of thousands of servers around the world
• Tens of millions of customers
High Reliability
High scale
19. Amazon Dynamo
Key-Value Store Database
• High Reliability
• High Scale
21. Web Company
• Startup with explosive growth:
• Open source DBMS
• v1.0: 1 node, soon becomes inadequate
• Next version:
• Horizontal partitioning (sharding)
• Node routing implemented inside the
application logic
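Node routing inside the application logic can be sketched roughly as follows. This is an illustrative assumption, not the approach of any specific product: the `ShardRouter` and `ShardedStore` classes and the node names are hypothetical, and each shard is simulated with an in-memory dict instead of a real database node.

```python
import hashlib


class ShardRouter:
    """Maps a key to one node using a stable hash modulo the node count."""

    def __init__(self, nodes):
        self.nodes = list(nodes)

    def node_for(self, key):
        # A stable hash (unlike Python's built-in hash for str) gives the
        # same routing decision in every process.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        index = int.from_bytes(digest[:8], "big") % len(self.nodes)
        return self.nodes[index]


class ShardedStore:
    """Application-level sharding: every read and write is routed by key."""

    def __init__(self, node_names):
        self.router = ShardRouter(node_names)
        # One dict per node stands in for one database server.
        self.shards = {name: {} for name in node_names}

    def put(self, key, value):
        self.shards[self.router.node_for(key)][key] = value

    def get(self, key, default=None):
        return self.shards[self.router.node_for(key)].get(key, default)
```

With plain modulo routing, changing the number of nodes remaps most keys, which is one reason data redistribution gets hard once the application owns the routing.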
22. Web Company
• Inter-node queries must be re-implemented
• Inter-node transactions must be handled
• Node failure is increasingly likely: less reliability,
less availability
• “Hot” data restructuring and data redistribution
become hard
23. Solution
Web application needs:
• Scalability: very simple operations,
but on many nodes
• Performance: low latency
• Productivity
• Flexibility (data structure)
• Ability to distribute data on many
nodes
25. Query Language
Give up a standard query language like SQL and
adopt a different kind of query language, depending
on the selected product:
• SQL-like
• map-reduce
• SPARQL
• ...
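To give a feel for the map-reduce style of querying, here is a small single-process sketch that counts words. It only illustrates the programming model; the function names are assumptions, and real products run the map and reduce steps distributed across many nodes.

```python
from collections import defaultdict


def map_phase(document):
    """Emit one (word, 1) pair for every word in the document."""
    for word in document.lower().split():
        yield word, 1


def shuffle(pairs):
    """Group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Collapse the values of one key into a single result."""
    return key, sum(values)


def map_reduce(documents):
    """Run map, shuffle and reduce over a list of documents."""
    pairs = (pair for doc in documents for pair in map_phase(doc))
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())
```

For instance, `map_reduce(["a b a", "b c"])` counts `a` twice, `b` twice and `c` once.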
26. CAP Theorem (2000)
• Consistency
• Availability
• Partition Tolerance
(Eric Brewer)
It is impossible to have all three at the same
time in a distributed system. You have to choose
only two.
27. Consistency
• Strong: after the update completes, any
subsequent access will return the updated
value.
• Weak: the system does not guarantee that
subsequent accesses will return the updated
value.
• Eventual: the storage system guarantees that,
if no new updates are made to the object,
eventually (after the inconsistency window
closes) all accesses will return the last
updated value.
(Figure: nodes N1, N2, N4, N5, N6 with time labels tk)
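The inconsistency window can be illustrated with a toy simulation. This is a sketch under strong assumptions: the `Replica` and `EventuallyConsistentStore` classes are hypothetical, writes always go to the first replica, and propagation is triggered by hand instead of happening asynchronously over a network.

```python
class Replica:
    """One copy of a single object, with a version number."""

    def __init__(self, name):
        self.name = name
        self.value = None
        self.version = 0


class EventuallyConsistentStore:
    """Writes hit one replica at once and reach the others later."""

    def __init__(self, replica_names):
        self.replicas = [Replica(name) for name in replica_names]
        self.pending = []  # updates not yet delivered: (replica, version, value)
        self.clock = 0

    def write(self, value):
        self.clock += 1
        coordinator, *others = self.replicas
        coordinator.value, coordinator.version = value, self.clock
        for replica in others:
            self.pending.append((replica, self.clock, value))

    def read(self, replica_index):
        # May return a stale value while updates are still pending.
        return self.replicas[replica_index].value

    def propagate(self):
        """Deliver all pending updates; this closes the inconsistency window."""
        for replica, version, value in self.pending:
            if version > replica.version:  # never overwrite newer data
                replica.value, replica.version = value, version
        self.pending.clear()
```

Between `write` and `propagate`, a read from the first replica sees the new value while the other replicas still return the old one. Once no new updates arrive and `propagate` has run, every replica returns the last written value.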
29. Facebook Cassandra
• Key-Value store
• data model: BigTable
• infrastructure: Amazon Dynamo
• Eventual Consistency
• High Availability
38. Key Value Store
One key -> one value
It is like a hash
The DB knows the type of the “key”
(integer, float, ...), but nothing about the value
Very fast
‘name’ => ‘david’
(key => value)
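A key-value store can be sketched as a thin wrapper around a hash table. The `KeyValueStore` class below is a hypothetical illustration, not the API of any of the products named in these slides: keys must have a known simple type, while values are stored as opaque bytes that the store never inspects.

```python
class KeyValueStore:
    """Minimal in-memory key-value store: one key maps to one opaque value."""

    ALLOWED_KEY_TYPES = (int, float, str)

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # The store checks the key type but treats the value as raw bytes.
        if not isinstance(key, self.ALLOWED_KEY_TYPES):
            raise TypeError("unsupported key type")
        self._data[key] = bytes(value)

    def get(self, key):
        """Return the stored bytes, or None if the key is absent."""
        return self._data.get(key)

    def delete(self, key):
        self._data.pop(key, None)


store = KeyValueStore()
store.put("name", b"david")
```

Because lookups go straight to the hash table and the value is never parsed, reads and writes stay fast, but any querying by value has to happen in the application.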
39. Key Value Store
Performance: high
Scalability: high
Flexibility: high
Complexity: none
Functionality: variable (none)
Examples:
• Redis
• memcached
• Dynamo
• Voldemort
56. OrientDB library for PHP
https://github.com/congow/Orient
A set of tools to use and manage any OrientDB
instance from PHP.
Orient includes:
• the HTTP protocol binding
• the query builder
• the data mapper (Object Graph Mapper)