NoSQL presentation

NoSQL: Memcached
A case study
Saifuddin Kaijar

What is a Key-Value Store?
(Source: Wikipedia, "Key-value database")
A key-value store, or key-value database, is a data storage paradigm designed
for storing, retrieving, and managing associative arrays, a data structure more
commonly known today as a dictionary or hash.

Why not RDBMS/SQL?
Relational Data - Example Table Structure

Why not RDBMS/SQL?
Because relational databases enforce referential integrity,
it is very difficult for them to offer partition
tolerance and horizontal scaling.
● Need to scale
● SQL joins are very costly

Why not RDBMS/SQL?
In addition to the scale-out requirement, relational databases
have another handicap: schema-on-write. A static schema is
ideal for fast queries over data whose shape is known in
advance.
But Internet data is often schema-less, and sometimes you need
a flexible schema to manage data whose shape is not known in advance.

If not SQL, then what?
NoSQL ("Not only SQL") refers to the different types of databases designed to
address the volume and variety problems that relational databases do not solve
well.

The Basics of Key/Value Caching
A popular solution to this problem is to augment the RDBMS with a Key/Value
(K/V) cache such as memcached.
This optimization offloads most read operations from the database. Thus, reads
are faster because they’re serviced primarily from the cache, and writes
(transactions) are faster because they’re not contending with reads.
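The read path described above is often called a look-aside (or cache-aside) cache. The following is a minimal sketch of that pattern, not memcached's own code: the in-process array stands in for a memcached server, and db_query, cached_read, invalidate, and db_reads are hypothetical names introduced only for illustration.

```c
#include <stdio.h>
#include <string.h>

#define CACHE_SLOTS 64
#define MAX_KEY 32
#define MAX_VAL 64

/* Toy in-process cache standing in for memcached (hypothetical). */
struct cache_entry {
    int  used;
    char key[MAX_KEY];
    char val[MAX_VAL];
};

static struct cache_entry cache[CACHE_SLOTS];
static int db_reads = 0;   /* counts how often the "database" is hit */

/* Simple string hash (djb2) used to pick a cache slot. */
static unsigned slot_for(const char *key) {
    unsigned long h = 5381;
    while (*key) h = h * 33 + (unsigned char)*key++;
    return (unsigned)(h % CACHE_SLOTS);
}

/* Stand-in for a slow SQL query (hypothetical). */
static void db_query(const char *key, char *out, size_t out_len) {
    db_reads++;
    snprintf(out, out_len, "row-for-%s", key);
}

/* Look-aside read: try the cache first, fall back to the database on a
 * miss, then populate the cache so the next read is served from memory. */
void cached_read(const char *key, char *out, size_t out_len) {
    struct cache_entry *e = &cache[slot_for(key)];
    if (e->used && strcmp(e->key, key) == 0) {   /* cache hit */
        snprintf(out, out_len, "%s", e->val);
        return;
    }
    db_query(key, out, out_len);                 /* cache miss */
    e->used = 1;
    snprintf(e->key, sizeof e->key, "%s", key);
    snprintf(e->val, sizeof e->val, "%s", out);
}

/* After a write to the database, drop the cached copy so that the next
 * read fetches the fresh row. */
void invalidate(const char *key) {
    struct cache_entry *e = &cache[slot_for(key)];
    if (e->used && strcmp(e->key, key) == 0) e->used = 0;
}
```

Reading the same key twice touches the database only once; calling invalidate after an update forces the next read back to the database.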

History of Memcached
Brad Fitzpatrick of Danga Interactive developed
memcached to speed up livejournal.com, which was
then serving 20 million+ dynamic page views per day.
Memcached reduced the database load to almost nothing,
yielding faster page load times and better resource utilization.
Facebook is the biggest user of memcached after LiveJournal.
It has more than 800 dedicated memcached servers that together
provide 28 terabytes of memory.

Three W’s
What is Memcached?
Memcached is a high-performance, distributed in-memory object caching system, generic
in nature, but intended for use in speeding up dynamic web applications by alleviating database load.
When can we use it?
Anywhere you have spare RAM. It is mostly used in wikis, social networking sites, and bookmarking sites.
Why should we use it?
If you have a high-traffic, dynamically generated site whose heavy database load consists
mostly of reads, then memcached can help lighten the load on your database.

Current Memcached Architecture
Memcached has four main data structures:
● A hash table to locate a cache item.
● Least Recently Used (LRU) list to determine cache item
eviction order when the cache is full.
● A cache item data structure for holding the key, data,
flags, and pointers.
● A slab allocator, which is the memory manager for cache
item data structures.
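To make the slab allocator's role concrete, here is a small sketch of how an item size could be mapped to a slab class. It assumes chunk sizes that grow by a factor of 1.25 and are rounded up to 8 bytes; BASE_CHUNK, NUM_CLASSES, slab_init, and slab_class_for are illustrative names and values, not memcached's actual source.

```c
#include <stddef.h>

#define NUM_CLASSES 8      /* illustrative; a real server has many more classes */
#define BASE_CHUNK  96     /* assumed smallest chunk size, in bytes */
#define ALIGN       8      /* chunk sizes are rounded up to a multiple of 8 */

static size_t class_size[NUM_CLASSES];

/* Build the table of chunk sizes: each class is about 1.25x the previous
 * one, rounded up to the alignment. */
void slab_init(void) {
    size_t size = BASE_CHUNK;
    for (int i = 0; i < NUM_CLASSES; i++) {
        if (size % ALIGN) size += ALIGN - (size % ALIGN);
        class_size[i] = size;
        size = size * 5 / 4;   /* growth factor 1.25 in integer arithmetic */
    }
}

/* Return the index of the smallest class whose chunks can hold the item,
 * or -1 if the item is larger than the largest chunk in this sketch. */
int slab_class_for(size_t item_size) {
    for (int i = 0; i < NUM_CLASSES; i++)
        if (item_size <= class_size[i]) return i;
    return -1;
}
```

Because every chunk in a class has the same size, freeing and reusing chunks does not fragment memory; the cost is some wasted space when an item is smaller than its chunk.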

The basic memcached data structure for stored data is a BLOB:
● Header, i.e. access time, size of the blob, reference count, expire time
● Key
● Payload data
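The blob layout above can be sketched as a single C allocation with a fixed header followed by the key and the payload. The field names and widths below are assumptions made for illustration and do not reproduce memcached's real item struct; the 250-byte key limit matches memcached's documented maximum key length.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* One cache item: header fields, then key bytes, then payload bytes. */
struct item {
    time_t   last_access;   /* header: access time */
    time_t   expires;       /* header: expire time, 0 means never */
    uint32_t nbytes;        /* header: size of the payload */
    uint16_t refcount;      /* header: reference count */
    uint8_t  nkey;          /* length of the key, without the NUL */
    char     data[];        /* key (NUL-terminated) followed by payload */
};

/* Total bytes needed for one item with the given key and payload sizes. */
size_t item_size(size_t nkey, size_t nbytes) {
    return sizeof(struct item) + nkey + 1 + nbytes;
}

/* Allocate an item and copy the key and payload into it. */
struct item *item_new(const char *key, const void *payload,
                      uint32_t nbytes, time_t expires) {
    size_t nkey = strlen(key);
    if (nkey > 250) return NULL;           /* keys are limited to 250 bytes */
    struct item *it = malloc(item_size(nkey, nbytes));
    if (!it) return NULL;
    it->last_access = time(NULL);
    it->expires = expires;
    it->nbytes = nbytes;
    it->refcount = 1;
    it->nkey = (uint8_t)nkey;
    memcpy(it->data, key, nkey + 1);                   /* key plus NUL */
    memcpy(it->data + nkey + 1, payload, nbytes);      /* payload */
    return it;
}

const char *item_key(const struct item *it)     { return it->data; }
const char *item_payload(const struct item *it) { return it->data + it->nkey + 1; }
```

Keeping header, key, and payload in one contiguous block means one allocation per item, which is what lets a slab allocator hand out fixed-size chunks.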

Basic memcached Operations
The memcached interface supports the following methods for storing and retrieving information in the cache. They are
consistent across all the different APIs, although the language-specific mechanics might differ:
● get(key): Retrieves information from the cache.
● store(key, value [, expiry]): Sets the item associated with a key in the cache to the
specified value (the set command in the memcached protocol).
● add(key, value [, expiry]): Adds the key and associated value to the cache, if the
specified key does not already exist.
● replace(key, value [, expiry]): Replaces the item associated with the specified key, only
if the key already exists. The new value is given by the value parameter.
● delete(key [, time]): Deletes the key and its associated item from the cache. If you supply
a time, then adding another item with the specified key is blocked for the specified period.
● incr(key, value): Increments the item associated with the key by the specified value.
● decr(key, value): Decrements the item associated with the key by the specified value.
● flush_all: Invalidates (or expires) all the current items in the cache.
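The difference between store, add, and replace is easiest to see in code. The sketch below implements those semantics on a tiny fixed-size table; it is a toy model with hypothetical kv_* names, not a memcached client, and it leaves out expiry times and decr for brevity.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ITEMS 16
#define MAX_LEN   64

struct kv { int used; char key[MAX_LEN]; char val[MAX_LEN]; };
static struct kv table[MAX_ITEMS];

/* Find the entry for a key, or NULL if the key is not present. */
static struct kv *find(const char *key) {
    for (int i = 0; i < MAX_ITEMS; i++)
        if (table[i].used && strcmp(table[i].key, key) == 0) return &table[i];
    return NULL;
}

/* Find an unused slot, or NULL if the table is full. */
static struct kv *free_slot(void) {
    for (int i = 0; i < MAX_ITEMS; i++)
        if (!table[i].used) return &table[i];
    return NULL;
}

/* Write key and value into a slot; returns 1 on success, 0 if no slot. */
static int put(struct kv *e, const char *key, const char *val) {
    if (!e) return 0;
    e->used = 1;
    snprintf(e->key, sizeof e->key, "%s", key);
    snprintf(e->val, sizeof e->val, "%s", val);
    return 1;
}

/* store: unconditional write, whether or not the key exists. */
int kv_store(const char *key, const char *val) {
    struct kv *e = find(key);
    return put(e ? e : free_slot(), key, val);
}

/* add: succeeds only if the key does not already exist. */
int kv_add(const char *key, const char *val) {
    return find(key) ? 0 : put(free_slot(), key, val);
}

/* replace: succeeds only if the key already exists. */
int kv_replace(const char *key, const char *val) {
    struct kv *e = find(key);
    return e ? put(e, key, val) : 0;
}

/* get: returns the stored value, or NULL on a miss. */
const char *kv_get(const char *key) {
    struct kv *e = find(key);
    return e ? e->val : NULL;
}

/* delete: removes the key; returns 0 if it was not present. */
int kv_delete(const char *key) {
    struct kv *e = find(key);
    if (!e) return 0;
    e->used = 0;
    return 1;
}

/* incr: the value is stored as decimal text and rewritten in place. */
int kv_incr(const char *key, unsigned long delta) {
    struct kv *e = find(key);
    if (!e) return 0;
    unsigned long n = strtoul(e->val, NULL, 10) + delta;
    snprintf(e->val, sizeof e->val, "%lu", n);
    return 1;
}

/* flush_all: drop every item at once. */
void kv_flush_all(void) { memset(table, 0, sizeof table); }
```

For example, kv_add("alok", "red") succeeds once and then fails on a second call, while kv_replace("saif", "blue") fails until the key has been stored.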

For the three primary commands (GET, STORE, and DELETE), the flow
of memcached is as follows:
1. Request arrives at NIC and is processed by libevent
2. A worker thread
○ Takes connection and data packet
○ Determines the command and data
3. A hash value is created from the key data.
4. The cache lock is acquired to begin hash table and LRU
processing. (Critical Section)
○ (STORE only)
■ Cache item memory is allocated.
■ Data is loaded into the cache item data structure
(flags, key, and value).
○ Hash table is traversed to GET, STORE, or DELETE the
cache item from the hash table.
○ LRU is modified to move the cache item to the front of
the LRU (GET, STORE) or to remove it (DELETE).
○ Cache item flags are updated
5. Cache lock is released. (End of Critical Section)
6. Response is constructed.
7. (GET only) Cache lock is acquired again.
○ Cache item reference counter is decremented.
○ Cache lock is released.
8. Response is transmitted back to requesting client (i.e. Front-end
Web server).
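Steps 3 to 5 of this flow can be sketched as follows: the key is hashed before the lock is taken, and a single global lock then protects both the hash table and the LRU list. This is a simplified model under stated assumptions: a C11 atomic_flag spinlock stands in for the real cache lock, items are fixed-size, eviction and reference counting are omitted, and all names are hypothetical.

```c
#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 64
#define KEY_MAX  31

struct item {
    struct item *hnext;        /* next item in the same hash bucket */
    struct item *prev, *next;  /* neighbours in the LRU list */
    char key[KEY_MAX + 1];
    char val[64];
};

static struct item *buckets[NBUCKETS];
static struct item *lru_head, *lru_tail;   /* head = most recently used */
static atomic_flag cache_lock = ATOMIC_FLAG_INIT;

static void cache_lock_acquire(void) { while (atomic_flag_test_and_set(&cache_lock)) { } }
static void cache_lock_release(void) { atomic_flag_clear(&cache_lock); }

static unsigned hash_key(const char *k) {
    unsigned long h = 5381;
    while (*k) h = h * 33 + (unsigned char)*k++;
    return (unsigned)(h % NBUCKETS);
}

static void lru_unlink(struct item *it) {
    if (it->prev) it->prev->next = it->next; else lru_head = it->next;
    if (it->next) it->next->prev = it->prev; else lru_tail = it->prev;
    it->prev = it->next = NULL;
}

static void lru_push_front(struct item *it) {
    it->prev = NULL;
    it->next = lru_head;
    if (lru_head) lru_head->prev = it; else lru_tail = it;
    lru_head = it;
}

static struct item *lookup(const char *key, unsigned h) {
    for (struct item *it = buckets[h]; it; it = it->hnext)
        if (strcmp(it->key, key) == 0) return it;
    return NULL;
}

/* GET: find the item, move it to the front of the LRU, copy the value out. */
int cache_get(const char *key, char *out, size_t out_len) {
    unsigned h = hash_key(key);            /* step 3: hash outside the lock */
    cache_lock_acquire();                  /* step 4: critical section */
    struct item *it = lookup(key, h);
    if (it) {
        lru_unlink(it);
        lru_push_front(it);
        snprintf(out, out_len, "%s", it->val);
    }
    cache_lock_release();                  /* step 5 */
    return it != NULL;
}

/* STORE: allocate the item if needed, link it into the hash table and LRU. */
int cache_store(const char *key, const char *val) {
    if (strlen(key) > KEY_MAX) return 0;
    unsigned h = hash_key(key);
    cache_lock_acquire();
    struct item *it = lookup(key, h);
    if (it) {
        lru_unlink(it);
    } else if ((it = calloc(1, sizeof *it)) != NULL) {
        snprintf(it->key, sizeof it->key, "%s", key);
        it->hnext = buckets[h];
        buckets[h] = it;
    }
    if (it) {
        snprintf(it->val, sizeof it->val, "%s", val);
        lru_push_front(it);
    }
    cache_lock_release();
    return it != NULL;
}

/* DELETE: unlink the item from its hash bucket and from the LRU list. */
int cache_delete(const char *key) {
    unsigned h = hash_key(key);
    cache_lock_acquire();
    struct item **pp = &buckets[h];
    while (*pp && strcmp((*pp)->key, key) != 0) pp = &(*pp)->hnext;
    struct item *it = *pp;
    if (it) { *pp = it->hnext; lru_unlink(it); }
    cache_lock_release();
    int found = (it != NULL);
    free(it);
    return found;
}
```

Every operation, including a read-only GET, has to take the same lock because GET also rewrites the LRU list. That serialization is the scalability bottleneck the rest of the deck discusses.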

Problems of Memcached
● Cold cache: Because a new memcached server node starts empty, all data requests
must be served by the database servers in the RDBMS layer until that cache node warms up.
● No built-in security: By default, memcached performs no authentication or encryption.
Ensure that your server is appropriately firewalled, and that the port(s) used for memcached
servers are not publicly accessible.
● No persistence: Data is not persisted. Everything is lost when you restart the server.
● Limited to the host's main memory

Scaling and Improving Memcached
[Diagram: Web Application, Database]

Areas of Improvement:
● Network stack for efficient request processing
● Fast and scalable parallel data access
● New data structures for key-value storage

● Network stack for efficient request processing:
○ Network I/O is one of the most expensive processing steps for
in-memory key-value storage. TCP processing alone may consume
70% of CPU time on a many-core optimized key-value store
○ Direct NIC access

● Fast and scalable parallel data access:
○ Concurrent access vs. exclusive access
[Figure: with concurrent access, four CPU cores all access one shared data set; with exclusive access, the data is split into Partition 0 to Partition 3 and each partition is accessed by a single CPU core.]
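Exclusive access depends on a deterministic mapping from key to partition, so that every request for a given key reaches the same core. A minimal sketch, assuming four partitions and the FNV-1a hash (the hash choice and the function names are illustrative, not taken from the referenced systems):

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_PARTITIONS 4   /* one partition per CPU core in this sketch */

/* 64-bit FNV-1a hash of the key bytes. */
uint64_t key_hash(const char *key, size_t len) {
    uint64_t h = 0xcbf29ce484222325ULL;      /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= (unsigned char)key[i];
        h *= 0x100000001b3ULL;               /* FNV prime */
    }
    return h;
}

/* The partition (and therefore the owning core) for a key. Since a key
 * always maps to the same partition, only one core ever touches its data
 * and no lock is needed inside a partition. */
unsigned partition_for(const char *key, size_t len) {
    return (unsigned)(key_hash(key, len) % NUM_PARTITIONS);
}
```

The trade-off is load balance: a very popular key makes its partition, and the core that owns it, hotter than the others.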

● New data structures for key-value storage:
○ Improved Hashing Technique
○ Efficient Command Structure
○ Indexing in memcached
○ Improved Changing
○ Efficient memory management (slab allocator)

Memcached Demo
Starting the memcached server:
$ /usr/bin/memcached -m 64 -p 11211 -u memcache -l 127.0.0.1
Storing key-value pairs in memcached:
const char *keys[] = {"alok", "saif", "abhijit"};
const char *values[] = {"red", "blue", "green"};
$ ./memcmcache_store
Fetching the value for a given key from memcached:
$ ./memcmcache_fetch 0

References :
● H. Lim, D. Han, D. G. Andersen, and M. Kaminsky, "MICA: A Holistic
Approach to Fast In-Memory Key-Value Storage," in NSDI, 2014.
● James T. Langston Jr. (Intel), "Enhancing the Scalability of Memcached,"
August 16, 2012.
● Sheng Li, Hyeontaek Lim, Victor W. Lee, Jung Ho Ahn, Anuj Kalia, et al.,
"Architecting to Achieve a Billion Requests Per Second Throughput on a
Single Key-Value Store Server Platform," in ISCA '15, June 13-17, 2015,
Portland.
● http://memcached.org/
