As more web developers strive to make their applications scalable, we see a shift away from the traditional LAMP stack towards technologies built with a focus on scaling. As part of this shift, a new approach to data storage for the web is needed: traditional RDBMSs are not well suited to many of the problems that appear in large-scale web applications. Fortunately, a large number of alternatives to the RDBMS have sprung up, each with different goals and approaches to the problem of scalability.
NoSQL - the Shift to a Non-Relational World
1. NoSQL: The Shift to a Non-relational World
Dwight Merriman
10gen / MongoDB
2. The database world is changing
no longer one-size-fits-all
RDBMS (Oracle, MySQL)
New gen. OLAP (Vertica, Aster, Greenplum)
Non-relational operational stores (“NoSQL”)
3. The Web Domain
Distributed & Unpredictable
Big data
Photos, videos, huge numbers of users
Not all data created equal
high-value (credit cards, transactions)
low-value (analytics, logs, tweets?)
Nimbleness Critical
agile development
new programming models
7. Scaling out - CAP
(C = consistency, A = availability, P = partition tolerance)
A: Amazon Dynamo inspired (Voldemort, Cassandra, ...)
C P: Google BigTable / Paxos inspired (MongoDB, Hypertable, HBase, ...)
8. Scaling out
distribution & query models
consistent hashing
range chunking
order preserving
scatter / gather
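To make the first distribution model concrete, here is a minimal sketch of consistent hashing in plain JavaScript (Node.js). It is an illustration under stated assumptions, not how any of the systems named in this deck implement it: the node names, the `buildRing` and `nodeFor` helpers, the use of MD5, and the 64 virtual points per node are all hypothetical choices.

```javascript
const crypto = require('crypto');

// Map a string to a 32-bit position on the ring (MD5 is an arbitrary choice).
function ringPosition(value) {
  return crypto.createHash('md5').update(value).digest().readUInt32BE(0);
}

// Hypothetical helper: place several virtual points per node on the ring.
function buildRing(nodes, pointsPerNode = 64) {
  const ring = [];
  for (const node of nodes) {
    for (let i = 0; i < pointsPerNode; i++) {
      ring.push({ position: ringPosition(`${node}#${i}`), node });
    }
  }
  return ring.sort((a, b) => a.position - b.position);
}

// A key belongs to the first ring point at or after its own position,
// wrapping around to the start of the ring.
function nodeFor(ring, key) {
  const position = ringPosition(key);
  const owner = ring.find((point) => point.position >= position);
  return (owner || ring[0]).node;
}

const keys = Array.from({ length: 1000 }, (_, i) => `post:${i}`);
const before = buildRing(['node-a', 'node-b', 'node-c']);
const after = buildRing(['node-a', 'node-b', 'node-c', 'node-d']);

// Keys whose owner changed once node-d joined the ring.
const moved = keys.filter((key) => nodeFor(before, key) !== nodeFor(after, key));
console.log(`${moved.length} of ${keys.length} keys moved`);
```

The property worth noticing is that adding a node only moves keys onto the new node; keys never shuffle between the existing nodes. Hashing scatters neighbouring keys, though, so a range query has to be sent to every node (scatter / gather), whereas range chunking keeps key order and can route a range query to fewer nodes.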
9. Data models
no joins + light transactional semantics = horizontally scalable architectures
important side effect: new data models = improved ways to develop applications
13. MongoDB at Business Insider
600k pageviews / day
3 apache servers
1 database server at 5%
MongoDB data includes
posts, comments
site settings
real-time analytics
images
Ian - “We’re using LAMP - Linux, Apache, MongoDB, PHP”
http://www.businessinsider.com/how-we-use-mongodb-2009-11
14. Business Insider Data Model
{ title: 'Too Big to Fail',
  author: 'John S',
  ts: new Date('2009-11-05T10:33:00'),
  comments: [ { author: 'Ian White',
                comment: 'Great article!' },
              { author: 'Joe Smith',
                comment: 'But how fast is it?',
                replies: [ { author: 'Jane Smith',
                             comment: 'scalable?' } ]
              }
            ],
  tags: ['finance', 'economy']
}
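As a rough illustration of why this model needs no joins, the sketch below treats the post as a plain JavaScript object and counts its comments, replies included, with a single recursive walk. The `countComments` helper is hypothetical and only stands in for what application code might do after fetching one document.

```javascript
// The post from the slide as a plain JavaScript object.
const post = {
  title: 'Too Big to Fail',
  author: 'John S',
  ts: new Date('2009-11-05T10:33:00'),
  comments: [
    { author: 'Ian White', comment: 'Great article!' },
    {
      author: 'Joe Smith',
      comment: 'But how fast is it?',
      replies: [{ author: 'Jane Smith', comment: 'scalable?' }],
    },
  ],
  tags: ['finance', 'economy'],
};

// Hypothetical helper: counts comments at every nesting level.
function countComments(comments = []) {
  return comments.reduce(
    (total, c) => total + 1 + countComments(c.replies),
    0
  );
}

console.log(countComments(post.comments)); // 3
```

In a relational schema the same data would typically be spread over posts, comments, and tags tables and reassembled with joins; here everything needed to render the page arrives in one document.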
15. MongoDB Query Example
cursor = db.posts.find({tags: 'economy'}).sort({ts: -1}).limit(10);
{ title: 'Too Big to Fail',
  author: 'John S',
  ts: new Date('2009-11-05T10:33:00'),
  comments: [ { author: 'Ian White',
                comment: 'Great article!' },
              { author: 'Joe Smith',
                comment: 'But how fast is it?',
                replies: [ { author: 'Jane Smith',
                             comment: 'scalable?' } ]
              }
            ],
  tags: ['finance', 'economy']
}
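For readers without a MongoDB instance at hand, the query's semantics can be approximated over an in-memory array. This is only a sketch: `latestByTag` is a hypothetical helper, the two extra posts are made-up sample data, and nothing here touches the MongoDB API.

```javascript
// In-memory stand-in for the posts collection (the extra posts are made up).
const posts = [
  { title: 'Too Big to Fail', ts: new Date('2009-11-05T10:33:00'), tags: ['finance', 'economy'] },
  { title: 'Older Post', ts: new Date('2009-10-01T09:00:00'), tags: ['economy'] },
  { title: 'Unrelated', ts: new Date('2009-11-06T12:00:00'), tags: ['tech'] },
];

// Hypothetical helper mirroring find({tags: tag}).sort({ts: -1}).limit(n).
function latestByTag(collection, tag, n) {
  return collection
    .filter((doc) => doc.tags.includes(tag)) // a scalar matches any array element
    .sort((a, b) => b.ts - a.ts)             // ts: -1 means newest first
    .slice(0, n);                            // limit(n)
}

const titles = latestByTag(posts, 'economy', 10).map((doc) => doc.title);
console.log(titles); // [ 'Too Big to Fail', 'Older Post' ]
```

The detail to notice is the first step: querying an array field with a single value matches any document whose array contains that value, which is why `{tags: 'economy'}` finds a post tagged with both 'finance' and 'economy'.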
16. Advantages vs. MySQL
“Schemaless”
Fast writes
analytics: 3-8 upserts per pageview
Fast reads
little caching necessary
Binary storage
storing images in the db itself
Easy development, scalability
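The "upserts per pageview" point can be sketched as follows. An upsert updates a record if it exists and inserts it otherwise; in MongoDB this is typically an update with the upsert option and an `$inc` modifier. The code below only imitates that behaviour with an in-memory `Map`; the `upsertIncrement` helper and the counter names are assumptions for illustration, not Business Insider's actual analytics code.

```javascript
// In-memory stand-in for an analytics collection, keyed by page path.
const pageStats = new Map();

// Hypothetical helper: update the counter if present, insert it otherwise.
function upsertIncrement(store, key, field, amount = 1) {
  const doc = store.get(key) || {};
  doc[field] = (doc[field] || 0) + amount;
  store.set(key, doc);
  return doc;
}

// One pageview may touch several counters.
upsertIncrement(pageStats, '/how-we-use-mongodb', 'views');
upsertIncrement(pageStats, '/how-we-use-mongodb', 'views');
upsertIncrement(pageStats, '/how-we-use-mongodb', 'uniques');
console.log(pageStats.get('/how-we-use-mongodb')); // { views: 2, uniques: 1 }
```

Because the caller never has to check whether the counter exists first, each pageview costs a handful of single-step writes rather than read-then-write round trips.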
17. Prediction
within 12 months, the majority of new web infrastructure projects will use a non-relational db as their primary data store