SlideShare une entreprise Scribd logo
1  sur  30
Cloning Twitter With
Redis
Dr. Fabio Fumarola
Motivation
• The programming community considers that key-
value cannot be used as replacement for a relational
database.
• Here we show how a key-value layer is an effective
data model to implement many kinds of applications.
2
A Twitter Clone
• One of the most successful new Internet services of
recent times is Twitter.
• Since its launch it has exploded from niche usage to
usage by the general populace, with celebrities such
as Oprah Winfrey, Britney Spears, and Shaquille
O'Neal, and politicians such as Barack Obama and Al
Gore jumping into it.
3
Why Twitter?
• Simple: it does not care what you share, as a long it is less
than 140 characters
• A means to have public conversation: Twitter allows a user
to tweet and have users respond using '@' reply, comment,
or re-tweet
• Fan versus friend
• Understanding user behavior
• Easy to share through text messaging
• Easy to access through multiple devices and applications
4
Main Features
• Allow users to post status updates (known as
'tweets' in Twitter) to the public.
• Allow users to follow and unfollow other users. Users
can follow any other user but it is not reciprocal.
• Allow users to send public messages directed to
particular users using the @ replies convention (in
Twitter this is known as mentions)
5
Main Features
• Allow users to send direct messages to other users,
messages are private to the sender and the recipient
user only (direct messages are only to a single
recipient).
• Allow users to re-tweet or forward another user's
status in their own status update.
• Provide a public timeline where all statuses are
publicly available for viewing.
• Provide APIs to allow external applications access.
6
Redis CLI
7
Redis CLI
• Save a Key Value
– SET foo bar
• Get a value for a key
– GET foo => bar
• Del a key value
– DEL foo
8
Redis CLI
• Increment a value for a key
– SET foo 10
– INCR foo => 11
– INCR foo => 12
– INCR foo => 13
• INCR is an atomic operation
– x = GET foo
– x = x + 1
– SET foo x
9
Redis CLI
• The problem with this kind of operation is when
multiple client update the same key
– x = GET foo (yields 10)
– y = GET foo (yields 10)
– x = x + 1 (x is now 11)
– y = y + 1 (y is now 11)
– SET foo x (foo is now 11)
– SET foo y (foo is now 11)
10
Redis CLI: LIST
• Beyond key-value stores: lists
– LPUSH mylist a (now mylist holds 'a')
– LPUSH mylist b (now mylist holds 'b','a')
– LPUSH mylist c (now mylist holds 'c','b','a')
• LPUSH means Left Push
• There is also the operation RPUSH
• This is very useful for our Twitter clone. User updates
can be added to a list stored in username:updates,
for instance.
11
Redis CLI: LIST
• LRANGE returns a range from the list
– LRANGE mylist 0 1 => c,b
– LRANGE mylist 0 -1 => c,b,a
• The last-index argument can be negative, with a
special meaning: -1 is the last element of the list, -2
the penultimate, and so on
12
Redis CLI: SET
• SADD is the add to set operation
• SREM is the remove from set operation
• SINTER is the perform intersection operation
• SCARD to get the cardinality of a Set
• SMEMBERS to return all the members of a Set.
13
Redis CLI: SET
– SADD myset a
– SADD myset b
– SADD myset foo
– SADD myset bar
– SCARD myset => 4
– SMEMBERS myset => bar,a,foo,b
14
Redis CLI: Sorted SET
• Sorted Set commands are prefixed with Z. The
following is an example of Sorted Sets usage:
– ZADD zset 10 a
– ZADD zset 5 b
– ZADD zset 12.55 c
– ZRANGE zset 0 -1 => b,a,c
• In the above example we added a few elements with
ZADD, and later retrieved the elements with ZRANGE
15
Redis CLI: Sorted SET
• The elements are returned in order according to
their score.
• In order to check if a given element exists, and also
to retrieve its score if it exists, we use the ZSCORE
command:
– ZSCORE zset a => 10
– ZSCORE zset non_existing_element => NULL
16
Redis CLI: HASH
• Redis Hashes are basically like Ruby or Python
hashes, a collection of fields associated with values:
– HMSET myuser name Salvatore surname Sanfilippo
country Italy
– HGET myuser surname => Sanfilippo
17
Data Layout
18
Data Layout
• When working with a relational database, a database
schema must be designed so that we'd know the
tables, indexes, and so on that the database will
contain.
• We don't have tables in Redis, so what do we need
to design?
• We need to identify what keys are needed to
represent our objects and what kind of values this
keys need to hold.
19
Users
• We need to represent users, of course, with their
– username, userid, password, the set of users following a
given user, the set of users a given user follows, and so on.
• The first question is, how should we identify a user?
• Asolution is to associate a unique ID with every user.
• Every other reference to this user will be done by id.
– INCR next_user_id => 1000
– HMSET user:1000 username antirez password p1pp0
20
Users
• Besides the fields already defined, we need some
more stuff in order to fully define a User
• For example, sometimes it can be useful to be able
to get the user ID from the username, so every time
we add an user, we also populate the users key,
which is an Hash, with the username as field, and its
ID as value.
– HSET users antirez 1000
21
Users
– HSET users antirez 1000
• We are only able to access data in a direct way,
without secondary indexes.
• It's not possible to tell Redis to return the key that
holds a specific value.
• This new paradigm is forcing us to organize data so
that everything is accessible by primary key, speaking
in relational DB terms.
22
Followers, following and updates
• A user might have users who follow them, which
we'll call their followers.
• A user might follow other users, which we'll call a
following.
• We have a perfect data structure for this. That is...
Sorted Set.
23
Followers, following and updates
• So let's define our keys:
– followers:1000 => Sorted Set of uids of all the followers
users
– following:1000 => Sorted Set of uids of all the following
users
• We can add new followers with:
– ZADD followers:1000 1401267618 1234 => Add user 1234
with time 1401267618
24
Followers, following and updates
• Another important thing we need is a place were we
can add the updates to display in the user's home
page.
• We'll need to access this data in chronological order
later
• Basically every new update will be LPUSHed in the
user updates key, and thanks to LRANGE, we can
implement pagination and so on.
25
Followers, following and updates
• Note, we use the words updates and posts
interchangeably, since updates are actually "little
posts" in some way.
– posts:1000 => a List of post ids - every new post is
LPUSHed here.
• This list is basically the User timeline.
• We'll push the IDs of her/his own posts, and, the IDs
of all the posts of created by the following users.
Basically we implement a write fanout.
26
Following Users
• We need to create following / follower relationships.
– If user ID 1000 (antirez) wants to follow user ID 5000
(pippo), we need to create both a following and a follower
relationship.
• We just need to ZADD calls:
– ZADD following:1000 5000
– ZADD followers:5000 1000
27
Following Users
• Note the same pattern again and again.
• In theory with a relational database the list of
following and followers would be contained in a
single table with fields like following_id and
follower_id.
• With a key-value DB things are a bit different since
we need to set both the 1000 is following 5000 and
5000 is followed by 1000 relations.
28
Making it horizontally scalable
• Our clone is extremely fast, without any kind of
cache.
• On a very slow and loaded server, an apache
benchmark with 100 parallel clients issuing 100000
requests measured the average pageview to take 5
milliseconds.
• This means we can serve millions of users every day
with just a single Linux box.
29
Making it horizontally scalable
• However you can't go with a single server forever,
how do you scale a key-value store?
• It does not perform any multi-keys operation, so
making it scalable is simple:
1. you may use client-side sharding,
2. or something like a sharding proxy like Twemproxy,
3. or the upcoming Redis Cluster.
30

Contenu connexe

Tendances

The inner workings of Dynamo DB
The inner workings of Dynamo DBThe inner workings of Dynamo DB
The inner workings of Dynamo DBJonathan Lau
 
Cross-Site BigTable using HBase
Cross-Site BigTable using HBaseCross-Site BigTable using HBase
Cross-Site BigTable using HBaseHBaseCon
 
MySQL database replication
MySQL database replicationMySQL database replication
MySQL database replicationPoguttuezhiniVP
 
SQL Server 2014 In-Memory OLTP
SQL Server 2014 In-Memory OLTPSQL Server 2014 In-Memory OLTP
SQL Server 2014 In-Memory OLTPTony Rogerson
 
Mysql database basic user guide
Mysql database basic user guideMysql database basic user guide
Mysql database basic user guidePoguttuezhiniVP
 
Apache HBase for Architects
Apache HBase for ArchitectsApache HBase for Architects
Apache HBase for ArchitectsNick Dimiduk
 
Introduction to PostgreSQL
Introduction to PostgreSQLIntroduction to PostgreSQL
Introduction to PostgreSQLJim Mlodgenski
 
New Security Features in Apache HBase 0.98: An Operator's Guide
New Security Features in Apache HBase 0.98: An Operator's GuideNew Security Features in Apache HBase 0.98: An Operator's Guide
New Security Features in Apache HBase 0.98: An Operator's GuideHBaseCon
 
Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra ExplainedEric Evans
 
Cassandra 2.1 boot camp, Read/Write path
Cassandra 2.1 boot camp, Read/Write pathCassandra 2.1 boot camp, Read/Write path
Cassandra 2.1 boot camp, Read/Write pathJoshua McKenzie
 
Mongodb - NoSql Database
Mongodb - NoSql DatabaseMongodb - NoSql Database
Mongodb - NoSql DatabasePrashant Gupta
 
Analyze corefile and backtraces with GDB for Mysql/MariaDB on Linux - Nilanda...
Analyze corefile and backtraces with GDB for Mysql/MariaDB on Linux - Nilanda...Analyze corefile and backtraces with GDB for Mysql/MariaDB on Linux - Nilanda...
Analyze corefile and backtraces with GDB for Mysql/MariaDB on Linux - Nilanda...Mydbops
 
Redis Functions, Data Structures for Web Scale Apps
Redis Functions, Data Structures for Web Scale AppsRedis Functions, Data Structures for Web Scale Apps
Redis Functions, Data Structures for Web Scale AppsDave Nielsen
 
Intro to HBase Internals & Schema Design (for HBase users)
Intro to HBase Internals & Schema Design (for HBase users)Intro to HBase Internals & Schema Design (for HBase users)
Intro to HBase Internals & Schema Design (for HBase users)alexbaranau
 

Tendances (20)

The inner workings of Dynamo DB
The inner workings of Dynamo DBThe inner workings of Dynamo DB
The inner workings of Dynamo DB
 
Nov 2011 HUG: Blur - Lucene on Hadoop
Nov 2011 HUG: Blur - Lucene on HadoopNov 2011 HUG: Blur - Lucene on Hadoop
Nov 2011 HUG: Blur - Lucene on Hadoop
 
Intro to HBase
Intro to HBaseIntro to HBase
Intro to HBase
 
Cross-Site BigTable using HBase
Cross-Site BigTable using HBaseCross-Site BigTable using HBase
Cross-Site BigTable using HBase
 
MySQL database replication
MySQL database replicationMySQL database replication
MySQL database replication
 
SQL Server 2014 In-Memory OLTP
SQL Server 2014 In-Memory OLTPSQL Server 2014 In-Memory OLTP
SQL Server 2014 In-Memory OLTP
 
Mysql database basic user guide
Mysql database basic user guideMysql database basic user guide
Mysql database basic user guide
 
Apache HBase for Architects
Apache HBase for ArchitectsApache HBase for Architects
Apache HBase for Architects
 
Introduction to PostgreSQL
Introduction to PostgreSQLIntroduction to PostgreSQL
Introduction to PostgreSQL
 
Apache phoenix
Apache phoenixApache phoenix
Apache phoenix
 
New Security Features in Apache HBase 0.98: An Operator's Guide
New Security Features in Apache HBase 0.98: An Operator's GuideNew Security Features in Apache HBase 0.98: An Operator's Guide
New Security Features in Apache HBase 0.98: An Operator's Guide
 
Cassandra Explained
Cassandra ExplainedCassandra Explained
Cassandra Explained
 
Advanced Sqoop
Advanced Sqoop Advanced Sqoop
Advanced Sqoop
 
Cassandra 2.1 boot camp, Read/Write path
Cassandra 2.1 boot camp, Read/Write pathCassandra 2.1 boot camp, Read/Write path
Cassandra 2.1 boot camp, Read/Write path
 
Mongodb - NoSql Database
Mongodb - NoSql DatabaseMongodb - NoSql Database
Mongodb - NoSql Database
 
Analyze corefile and backtraces with GDB for Mysql/MariaDB on Linux - Nilanda...
Analyze corefile and backtraces with GDB for Mysql/MariaDB on Linux - Nilanda...Analyze corefile and backtraces with GDB for Mysql/MariaDB on Linux - Nilanda...
Analyze corefile and backtraces with GDB for Mysql/MariaDB on Linux - Nilanda...
 
Postgresql
PostgresqlPostgresql
Postgresql
 
Redis Functions, Data Structures for Web Scale Apps
Redis Functions, Data Structures for Web Scale AppsRedis Functions, Data Structures for Web Scale Apps
Redis Functions, Data Structures for Web Scale Apps
 
Intro to HBase Internals & Schema Design (for HBase users)
Intro to HBase Internals & Schema Design (for HBase users)Intro to HBase Internals & Schema Design (for HBase users)
Intro to HBase Internals & Schema Design (for HBase users)
 
Learning postgresql
Learning postgresqlLearning postgresql
Learning postgresql
 

Similaire à 8. key value databases laboratory

NoSQL databases pros and cons
NoSQL databases pros and consNoSQL databases pros and cons
NoSQL databases pros and consFabio Fumarola
 
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019Dave Stokes
 
Microservices - Is it time to breakup?
Microservices - Is it time to breakup? Microservices - Is it time to breakup?
Microservices - Is it time to breakup? Dave Nielsen
 
Scylla Summit 2018: How Scylla Helps You to be a Better Application Developer
Scylla Summit 2018: How Scylla Helps You to be a Better Application DeveloperScylla Summit 2018: How Scylla Helps You to be a Better Application Developer
Scylla Summit 2018: How Scylla Helps You to be a Better Application DeveloperScyllaDB
 
[System design] Design a tweeter-like system
[System design] Design a tweeter-like system[System design] Design a tweeter-like system
[System design] Design a tweeter-like systemAree Oh
 
Move a successful onpremise oltp application to the cloud
Move a successful onpremise oltp application to the cloudMove a successful onpremise oltp application to the cloud
Move a successful onpremise oltp application to the cloudIke Ellis
 
テスト用のプレゼンテーション
テスト用のプレゼンテーションテスト用のプレゼンテーション
テスト用のプレゼンテーションgooseboi
 
Redis everywhere - PHP London
Redis everywhere - PHP LondonRedis everywhere - PHP London
Redis everywhere - PHP LondonRicard Clau
 
Access Data from XPages with the Relational Controls
Access Data from XPages with the Relational ControlsAccess Data from XPages with the Relational Controls
Access Data from XPages with the Relational ControlsTeamstudio
 
Introduction to column oriented databases in PHP
Introduction to column oriented databases in PHPIntroduction to column oriented databases in PHP
Introduction to column oriented databases in PHPZend by Rogue Wave Software
 
Manchester Serverless Meetup - July 2018
Manchester Serverless Meetup - July 2018Manchester Serverless Meetup - July 2018
Manchester Serverless Meetup - July 2018Jonathan Vines
 
Better Laziness Through Hypermedia -- Designing a Hypermedia Client
Better Laziness Through Hypermedia -- Designing a Hypermedia ClientBetter Laziness Through Hypermedia -- Designing a Hypermedia Client
Better Laziness Through Hypermedia -- Designing a Hypermedia ClientPete Gamache
 
Hadoop & no sql new generation database systems
Hadoop & no sql   new generation database systemsHadoop & no sql   new generation database systems
Hadoop & no sql new generation database systemsramazan fırın
 
MongoDB.local Atlanta: MongoDB @ Sensus: Xylem IoT and MongoDB
MongoDB.local Atlanta: MongoDB @ Sensus: Xylem IoT and MongoDBMongoDB.local Atlanta: MongoDB @ Sensus: Xylem IoT and MongoDB
MongoDB.local Atlanta: MongoDB @ Sensus: Xylem IoT and MongoDBMongoDB
 
PlayStation and Searchable Cassandra Without Solr (Dustin Pham & Alexander Fi...
PlayStation and Searchable Cassandra Without Solr (Dustin Pham & Alexander Fi...PlayStation and Searchable Cassandra Without Solr (Dustin Pham & Alexander Fi...
PlayStation and Searchable Cassandra Without Solr (Dustin Pham & Alexander Fi...DataStax
 
ppt 2.pptxandxikcicncmk0wufjepfc09eufcdc
ppt 2.pptxandxikcicncmk0wufjepfc09eufcdcppt 2.pptxandxikcicncmk0wufjepfc09eufcdc
ppt 2.pptxandxikcicncmk0wufjepfc09eufcdcashimavashisth001
 
BSides SG Practical Red Teaming Workshop
BSides SG Practical Red Teaming WorkshopBSides SG Practical Red Teaming Workshop
BSides SG Practical Red Teaming WorkshopAjay Choudhary
 

Similaire à 8. key value databases laboratory (20)

REDIS327
REDIS327REDIS327
REDIS327
 
NoSQL databases pros and cons
NoSQL databases pros and consNoSQL databases pros and cons
NoSQL databases pros and cons
 
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
MySQL Baics - Texas Linxufest beginners tutorial May 31st, 2019
 
Mini-Training: Redis
Mini-Training: RedisMini-Training: Redis
Mini-Training: Redis
 
Microservices - Is it time to breakup?
Microservices - Is it time to breakup? Microservices - Is it time to breakup?
Microservices - Is it time to breakup?
 
Scylla Summit 2018: How Scylla Helps You to be a Better Application Developer
Scylla Summit 2018: How Scylla Helps You to be a Better Application DeveloperScylla Summit 2018: How Scylla Helps You to be a Better Application Developer
Scylla Summit 2018: How Scylla Helps You to be a Better Application Developer
 
[System design] Design a tweeter-like system
[System design] Design a tweeter-like system[System design] Design a tweeter-like system
[System design] Design a tweeter-like system
 
Move a successful onpremise oltp application to the cloud
Move a successful onpremise oltp application to the cloudMove a successful onpremise oltp application to the cloud
Move a successful onpremise oltp application to the cloud
 
テスト用のプレゼンテーション
テスト用のプレゼンテーションテスト用のプレゼンテーション
テスト用のプレゼンテーション
 
Redis everywhere - PHP London
Redis everywhere - PHP LondonRedis everywhere - PHP London
Redis everywhere - PHP London
 
Access Data from XPages with the Relational Controls
Access Data from XPages with the Relational ControlsAccess Data from XPages with the Relational Controls
Access Data from XPages with the Relational Controls
 
Introduction to column oriented databases in PHP
Introduction to column oriented databases in PHPIntroduction to column oriented databases in PHP
Introduction to column oriented databases in PHP
 
Manchester Serverless Meetup - July 2018
Manchester Serverless Meetup - July 2018Manchester Serverless Meetup - July 2018
Manchester Serverless Meetup - July 2018
 
Redis Labcamp
Redis LabcampRedis Labcamp
Redis Labcamp
 
Better Laziness Through Hypermedia -- Designing a Hypermedia Client
Better Laziness Through Hypermedia -- Designing a Hypermedia ClientBetter Laziness Through Hypermedia -- Designing a Hypermedia Client
Better Laziness Through Hypermedia -- Designing a Hypermedia Client
 
Hadoop & no sql new generation database systems
Hadoop & no sql   new generation database systemsHadoop & no sql   new generation database systems
Hadoop & no sql new generation database systems
 
MongoDB.local Atlanta: MongoDB @ Sensus: Xylem IoT and MongoDB
MongoDB.local Atlanta: MongoDB @ Sensus: Xylem IoT and MongoDBMongoDB.local Atlanta: MongoDB @ Sensus: Xylem IoT and MongoDB
MongoDB.local Atlanta: MongoDB @ Sensus: Xylem IoT and MongoDB
 
PlayStation and Searchable Cassandra Without Solr (Dustin Pham & Alexander Fi...
PlayStation and Searchable Cassandra Without Solr (Dustin Pham & Alexander Fi...PlayStation and Searchable Cassandra Without Solr (Dustin Pham & Alexander Fi...
PlayStation and Searchable Cassandra Without Solr (Dustin Pham & Alexander Fi...
 
ppt 2.pptxandxikcicncmk0wufjepfc09eufcdc
ppt 2.pptxandxikcicncmk0wufjepfc09eufcdcppt 2.pptxandxikcicncmk0wufjepfc09eufcdc
ppt 2.pptxandxikcicncmk0wufjepfc09eufcdc
 
BSides SG Practical Red Teaming Workshop
BSides SG Practical Red Teaming WorkshopBSides SG Practical Red Teaming Workshop
BSides SG Practical Red Teaming Workshop
 

Plus de Fabio Fumarola

11. From Hadoop to Spark 2/2
11. From Hadoop to Spark 2/211. From Hadoop to Spark 2/2
11. From Hadoop to Spark 2/2Fabio Fumarola
 
11. From Hadoop to Spark 1:2
11. From Hadoop to Spark 1:211. From Hadoop to Spark 1:2
11. From Hadoop to Spark 1:2Fabio Fumarola
 
10b. Graph Databases Lab
10b. Graph Databases Lab10b. Graph Databases Lab
10b. Graph Databases LabFabio Fumarola
 
9. Document Oriented Databases
9. Document Oriented Databases9. Document Oriented Databases
9. Document Oriented DatabasesFabio Fumarola
 
8. column oriented databases
8. column oriented databases8. column oriented databases
8. column oriented databasesFabio Fumarola
 
7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth7. Key-Value Databases: In Depth
7. Key-Value Databases: In DepthFabio Fumarola
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2Fabio Fumarola
 
5 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/25 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/2Fabio Fumarola
 
2 Linux Container and Docker
2 Linux Container and Docker2 Linux Container and Docker
2 Linux Container and DockerFabio Fumarola
 
1. Introduction to the Course "Designing Data Bases with Advanced Data Models...
1. Introduction to the Course "Designing Data Bases with Advanced Data Models...1. Introduction to the Course "Designing Data Bases with Advanced Data Models...
1. Introduction to the Course "Designing Data Bases with Advanced Data Models...Fabio Fumarola
 
An introduction to maven gradle and sbt
An introduction to maven gradle and sbtAn introduction to maven gradle and sbt
An introduction to maven gradle and sbtFabio Fumarola
 
Develop with linux containers and docker
Develop with linux containers and dockerDevelop with linux containers and docker
Develop with linux containers and dockerFabio Fumarola
 
Linux containers and docker
Linux containers and dockerLinux containers and docker
Linux containers and dockerFabio Fumarola
 
A Parallel Algorithm for Approximate Frequent Itemset Mining using MapReduce
A Parallel Algorithm for Approximate Frequent Itemset Mining using MapReduce A Parallel Algorithm for Approximate Frequent Itemset Mining using MapReduce
A Parallel Algorithm for Approximate Frequent Itemset Mining using MapReduce Fabio Fumarola
 

Plus de Fabio Fumarola (19)

11. From Hadoop to Spark 2/2
11. From Hadoop to Spark 2/211. From Hadoop to Spark 2/2
11. From Hadoop to Spark 2/2
 
11. From Hadoop to Spark 1:2
11. From Hadoop to Spark 1:211. From Hadoop to Spark 1:2
11. From Hadoop to Spark 1:2
 
10b. Graph Databases Lab
10b. Graph Databases Lab10b. Graph Databases Lab
10b. Graph Databases Lab
 
10. Graph Databases
10. Graph Databases10. Graph Databases
10. Graph Databases
 
9. Document Oriented Databases
9. Document Oriented Databases9. Document Oriented Databases
9. Document Oriented Databases
 
8. column oriented databases
8. column oriented databases8. column oriented databases
8. column oriented databases
 
7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth7. Key-Value Databases: In Depth
7. Key-Value Databases: In Depth
 
6 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/26 Data Modeling for NoSQL 2/2
6 Data Modeling for NoSQL 2/2
 
5 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/25 Data Modeling for NoSQL 1/2
5 Data Modeling for NoSQL 1/2
 
3 Git
3 Git3 Git
3 Git
 
2 Linux Container and Docker
2 Linux Container and Docker2 Linux Container and Docker
2 Linux Container and Docker
 
1. Introduction to the Course "Designing Data Bases with Advanced Data Models...
1. Introduction to the Course "Designing Data Bases with Advanced Data Models...1. Introduction to the Course "Designing Data Bases with Advanced Data Models...
1. Introduction to the Course "Designing Data Bases with Advanced Data Models...
 
Scala and spark
Scala and sparkScala and spark
Scala and spark
 
Hbase an introduction
Hbase an introductionHbase an introduction
Hbase an introduction
 
An introduction to maven gradle and sbt
An introduction to maven gradle and sbtAn introduction to maven gradle and sbt
An introduction to maven gradle and sbt
 
Develop with linux containers and docker
Develop with linux containers and dockerDevelop with linux containers and docker
Develop with linux containers and docker
 
Linux containers and docker
Linux containers and dockerLinux containers and docker
Linux containers and docker
 
08 datasets
08 datasets08 datasets
08 datasets
 
A Parallel Algorithm for Approximate Frequent Itemset Mining using MapReduce
A Parallel Algorithm for Approximate Frequent Itemset Mining using MapReduce A Parallel Algorithm for Approximate Frequent Itemset Mining using MapReduce
A Parallel Algorithm for Approximate Frequent Itemset Mining using MapReduce
 

Dernier

Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectBoston Institute of Analytics
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.natarajan8993
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...dajasot375
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Sapana Sha
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceSapana Sha
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 217djon017
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxdolaknnilon
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPramod Kumar Srivastava
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfBoston Institute of Analytics
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queensdataanalyticsqueen03
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Jack DiGiovanna
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsappssapnasaifi408
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home ServiceSapana Sha
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryJeremy Anderson
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改yuu sss
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxUnduhUnggah1
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhijennyeacort
 

Dernier (20)

Heart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis ProjectHeart Disease Classification Report: A Data Analysis Project
Heart Disease Classification Report: A Data Analysis Project
 
RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.RABBIT: A CLI tool for identifying bots based on their GitHub events.
RABBIT: A CLI tool for identifying bots based on their GitHub events.
 
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
Indian Call Girls in Abu Dhabi O5286O24O8 Call Girls in Abu Dhabi By Independ...
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
Saket, (-DELHI )+91-9654467111-(=)CHEAP Call Girls in Escorts Service Saket C...
 
Call Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts ServiceCall Girls In Dwarka 9654467111 Escorts Service
Call Girls In Dwarka 9654467111 Escorts Service
 
Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2Easter Eggs From Star Wars and in cars 1 and 2
Easter Eggs From Star Wars and in cars 1 and 2
 
IMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptxIMA MSN - Medical Students Network (2).pptx
IMA MSN - Medical Students Network (2).pptx
 
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptxPKS-TGC-1084-630 - Stage 1 Proposal.pptx
PKS-TGC-1084-630 - Stage 1 Proposal.pptx
 
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdfPredicting Salary Using Data Science: A Comprehensive Analysis.pdf
Predicting Salary Using Data Science: A Comprehensive Analysis.pdf
 
Top 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In QueensTop 5 Best Data Analytics Courses In Queens
Top 5 Best Data Analytics Courses In Queens
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
Building on a FAIRly Strong Foundation to Connect Academic Research to Transl...
 
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /WhatsappsBeautiful Sapna Vip  Call Girls Hauz Khas 9711199012 Call /Whatsapps
Beautiful Sapna Vip Call Girls Hauz Khas 9711199012 Call /Whatsapps
 
9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service9654467111 Call Girls In Munirka Hotel And Home Service
9654467111 Call Girls In Munirka Hotel And Home Service
 
Defining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data StoryDefining Constituents, Data Vizzes and Telling a Data Story
Defining Constituents, Data Vizzes and Telling a Data Story
 
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
专业一比一美国俄亥俄大学毕业证成绩单pdf电子版制作修改
 
MK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docxMK KOMUNIKASI DATA (TI)komdat komdat.docx
MK KOMUNIKASI DATA (TI)komdat komdat.docx
 
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
Deep Generative Learning for All - The Gen AI Hype (Spring 2024)
 
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝DelhiRS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
RS 9000 Call In girls Dwarka Mor (DELHI)⇛9711147426🔝Delhi
 

8. key value databases laboratory

  • 2. Motivation • The programming community considers that key- value cannot be used as replacement for a relational database. • Here we show how a key-value layer is an effective data model to implement many kinds of applications. 2
  • 3. A Twitter Clone • One of the most successful new Internet services of recent times is Twitter. • Since its launch it has exploded from niche usage to usage by the general populace, with celebrities such as Oprah Winfrey, Britney Spears, and Shaquille O'Neal, and politicians such as Barack Obama and Al Gore jumping into it. 3
  • 4. Why Twitter? • Simple: it does not care what you share, as a long it is less than 140 characters • A means to have public conversation: Twitter allows a user to tweet and have users respond using '@' reply, comment, or re-tweet • Fan versus friend • Understanding user behavior • Easy to share through text messaging • Easy to access through multiple devices and applications 4
  • 5. Main Features • Allow users to post status updates (known as 'tweets' in Twitter) to the public. • Allow users to follow and unfollow other users. Users can follow any other user but it is not reciprocal. • Allow users to send public messages directed to particular users using the @ replies convention (in Twitter this is known as mentions) 5
  • 6. Main Features • Allow users to send direct messages to other users, messages are private to the sender and the recipient user only (direct messages are only to a single recipient). • Allow users to re-tweet or forward another user's status in their own status update. • Provide a public timeline where all statuses are publicly available for viewing. • Provide APIs to allow external applications access. 6
  • 8. Redis CLI • Save a Key Value – SET foo bar • Get a value for a key – GET foo => bar • Del a key value – DEL foo 8
  • 9. Redis CLI • Increment a value for a key – SET foo 10 – INCR foo => 11 – INCR foo => 12 – INCR foo => 13 • INCR is an atomic operation – x = GET foo – x = x + 1 – SET foo x 9
  • 10. Redis CLI • The problem with this kind of operation is when multiple client update the same key – x = GET foo (yields 10) – y = GET foo (yields 10) – x = x + 1 (x is now 11) – y = y + 1 (y is now 11) – SET foo x (foo is now 11) – SET foo y (foo is now 11) 10
  • 11. Redis CLI: LIST • Beyond key-value stores: lists – LPUSH mylist a (now mylist holds 'a') – LPUSH mylist b (now mylist holds 'b','a') – LPUSH mylist c (now mylist holds 'c','b','a') • LPUSH means Left Push • There is also the operation RPUSH • This is very useful for our Twitter clone. User updates can be added to a list stored in username:updates, for instance. 11
  • 12. Redis CLI: LIST • LRANGE returns a range from the list – LRANGE mylist 0 1 => c,b – LRANGE mylist 0 -1 => c,b,a • The last-index argument can be negative, with a special meaning: -1 is the last element of the list, -2 the penultimate, and so on 12
  • 13. Redis CLI: SET • SADD is the add to set operation • SREM is the remove from set operation • SINTER is the perform intersection operation • SCARD to get the cardinality of a Set • SMEMBERS to return all the members of a Set. 13
  • 14. Redis CLI: SET – SADD myset a – SADD myset b – SADD myset foo – SADD myset bar – SCARD myset => 4 – SMEMBERS myset => bar,a,foo,b 14
  • 15. Redis CLI: Sorted SET • Sorted Set commands are prefixed with Z. The following is an example of Sorted Sets usage: – ZADD zset 10 a – ZADD zset 5 b – ZADD zset 12.55 c – ZRANGE zset 0 -1 => b,a,c • In the above example we added a few elements with ZADD, and later retrieved the elements with ZRANGE 15
  • 16. Redis CLI: Sorted SET • The elements are returned in order according to their score. • In order to check if a given element exists, and also to retrieve its score if it exists, we use the ZSCORE command: – ZSCORE zset a => 10 – ZSCORE zset non_existing_element => NULL 16
  • 17. Redis CLI: HASH • Redis Hashes are basically like Ruby or Python hashes, a collection of fields associated with values: – HMSET myuser name Salvatore surname Sanfilippo country Italy – HGET myuser surname => Sanfilippo 17
  • 19. Data Layout • When working with a relational database, a database schema must be designed so that we'd know the tables, indexes, and so on that the database will contain. • We don't have tables in Redis, so what do we need to design? • We need to identify what keys are needed to represent our objects and what kind of values this keys need to hold. 19
  • 20. Users • We need to represent users, of course, with their – username, userid, password, the set of users following a given user, the set of users a given user follows, and so on. • The first question is, how should we identify a user? • Asolution is to associate a unique ID with every user. • Every other reference to this user will be done by id. – INCR next_user_id => 1000 – HMSET user:1000 username antirez password p1pp0 20
  • 21. Users • Besides the fields already defined, we need some more stuff in order to fully define a User • For example, sometimes it can be useful to be able to get the user ID from the username, so every time we add an user, we also populate the users key, which is an Hash, with the username as field, and its ID as value. – HSET users antirez 1000 21
  • 22. Users – HSET users antirez 1000 • We are only able to access data in a direct way, without secondary indexes. • It's not possible to tell Redis to return the key that holds a specific value. • This new paradigm is forcing us to organize data so that everything is accessible by primary key, speaking in relational DB terms. 22
  • 23. Followers, following and updates • A user might have users who follow them, which we'll call their followers. • A user might follow other users, which we'll call a following. • We have a perfect data structure for this. That is... Sorted Set. 23
  • 24. Followers, following and updates • So let's define our keys: – followers:1000 => Sorted Set of uids of all the followers users – following:1000 => Sorted Set of uids of all the following users • We can add new followers with: – ZADD followers:1000 1401267618 1234 => Add user 1234 with time 1401267618 24
  • 25. Followers, following and updates • Another important thing we need is a place were we can add the updates to display in the user's home page. • We'll need to access this data in chronological order later • Basically every new update will be LPUSHed in the user updates key, and thanks to LRANGE, we can implement pagination and so on. 25
  • 26. Followers, following and updates • Note, we use the words updates and posts interchangeably, since updates are actually "little posts" in some way. – posts:1000 => a List of post ids - every new post is LPUSHed here. • This list is basically the User timeline. • We'll push the IDs of her/his own posts, and, the IDs of all the posts of created by the following users. Basically we implement a write fanout. 26
  • 27. Following Users • We need to create following / follower relationships. – If user ID 1000 (antirez) wants to follow user ID 5000 (pippo), we need to create both a following and a follower relationship. • We just need to ZADD calls: – ZADD following:1000 5000 – ZADD followers:5000 1000 27
  • 28. Following Users • Note the same pattern again and again. • In theory with a relational database the list of following and followers would be contained in a single table with fields like following_id and follower_id. • With a key-value DB things are a bit different since we need to set both the 1000 is following 5000 and 5000 is followed by 1000 relations. 28
  • 29. Making it horizontally scalable • Our clone is extremely fast, without any kind of cache. • On a very slow and loaded server, an apache benchmark with 100 parallel clients issuing 100000 requests measured the average pageview to take 5 milliseconds. • This means we can serve millions of users every day with just a single Linux box. 29
  • 30. Making it horizontally scalable • However you can't go with a single server forever, how do you scale a key-value store? • It does not perform any multi-keys operation, so making it scalable is simple: 1. you may use client-side sharding, 2. or something like a sharding proxy like Twemproxy, 3. or the upcoming Redis Cluster. 30

Notes de l'éditeur

  1. We use the next_user_id key in order to always get an unique ID for every new user. Then we use this unique ID to name the key holding an Hash with user's data. This is a common design pattern with key-values stores! Keep it in mind.