This is a talk that I gave on July 20, 2012 at the Southern California Python Interest Group meetup at Cross Campus, with food and drinks provided by Graph Effect.
Python redis talk
1. Redis and Python
by Josiah Carlson
@dr_josiah
dr-josiah.blogspot.com
bit.ly/redis-in-action
2. Redis and Python:
It's PB & J time
by Josiah Carlson
@dr_josiah
dr-josiah.blogspot.com
bit.ly/redis-in-action
3. What will be covered
• Who am I?
• What is Redis?
• Why Redis with Python?
• Cool stuff you can do by combining them
4. Who am I?
• A Python user for 12+ years
• Former python-dev bike-shedder
• Former maintainer of Python async sockets libraries
• Author of a few small OS projects
o rpqueue, parse-crontab, async_http, timezone-utils, PyPE
• Worked at some cool places you've never heard of
(Networks In Motion, Ad.ly)
• Cool places you have (Google)
• And cool places you will (ChowNow)
• Heavy user of Redis
• Author of upcoming Redis in Action
5. What is Redis?
• In-memory database/data structure server
o Limited to main memory; vm and diskstore defunct
• Persistence via snapshot or append-only file
• Support for master/slave replication (multiple slaves
and slave chaining supported)
o No master-master, don't even try
o Client-side sharding
o Cluster is in-progress
• Five data structures + publish/subscribe
o Strings, Lists, Sets, Hashes, Sorted Sets (ZSETs)
• Server-side scripting with Lua in Redis 2.6
6. What is Redis? (compared to other databases/caches)
• Memcached
o in-memory, no-persistence, counters, strings, very fast, multi-threaded
• Redis
o in-memory, optionally persisted, data structures, very fast, server-side
scripting, single-threaded
• MongoDB
o on-disk, speed inversely related to data integrity, bson, master/slave,
sharding, multi-master, server-side mapreduce, database-level locking
• Riak
o on-disk, pluggable data stores, multi-master sharding, RESTful API,
server-side map-reduce, (Erlang + C)
• MySQL/PostgreSQL
o on-disk/in-memory, pluggable data stores, master/slave, sharding,
stored procedures, ...
7. What is Redis? (Strings)
• Really scalars of a few different types
o Character strings
concatenate values to the end
get/set individual bits
get/set byte ranges
o Integers (platform long int)
increment/decrement
auto "casting"
o Floats (IEEE 754 double-precision)
increment/decrement
auto "casting"
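The string operations map closely onto things Python can already do to a mutable byte buffer. A rough server-free analogy (this is plain Python, not the redis-py API):

```python
# Python analogy for Redis string operations, no server involved.
buf = bytearray(b"hello")

# APPEND: concatenate values to the end
buf += b" world"

# GETRANGE / SETRANGE: get and set byte ranges
assert bytes(buf[0:5]) == b"hello"
buf[0:5] = b"HELLO"

# INCR-style auto "casting": Redis parses the stored bytes as an
# integer, increments, and stores the result back as bytes.
counter = b"41"
counter = str(int(counter) + 1).encode()
```

The difference, of course, is that Redis does each of these atomically on the server, so many clients can share one value safely.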
8. What is Redis? (Lists)
• Doubly-linked list of character strings
o Push/pop from both ends
o [Blocking] pop from multiple lists
o [Blocking] pop from one list, push on another
o Get/set/search for item in a list
o Sortable
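The non-blocking operations behave much like Python's own double-ended queue; a sketch of the correspondence (an analogy, not redis-py calls):

```python
from collections import deque

# Python analogy: a deque also pushes/pops at both ends in O(1).
tasks = deque()
tasks.append(b"job1")      # RPUSH tasks job1
tasks.appendleft(b"job0")  # LPUSH tasks job0
first = tasks.popleft()    # LPOP tasks
last = tasks.pop()         # RPOP tasks
```

What a deque cannot give you is what makes Redis lists interesting: blocking pops and the atomic pop-from-one, push-to-another, shared across many client processes.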
9. What is Redis? (Sets)
• Unordered collection of unique character
strings
o Backed by a hash table
o Add, remove, check membership, pop, random pop
o Set intersection, union, difference
o Sortable
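The set commands mirror Python's built-in set operations almost one-to-one; a server-free sketch (key names are invented for illustration):

```python
# Python analogy for Redis set operations.
monday = {"alice", "bob", "carol"}  # SADD visitors:mon alice bob carol
tuesday = {"bob", "dave"}           # SADD visitors:tue bob dave

both = monday & tuesday             # SINTER visitors:mon visitors:tue
either = monday | tuesday           # SUNION visitors:mon visitors:tue
only_monday = monday - tuesday      # SDIFF visitors:mon visitors:tue
```

Redis can also store the result server-side under a new key (SINTERSTORE and friends), which keeps large intermediate results off the wire.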
10. What is Redis? (Hashes)
• Key-value mapping inside a key
o Get/Set/Delete single/multiple
o Increment values by ints/floats
o Bulk fetch of Keys/Values/Both
o Sort-of like a small version of Redis that only
supports strings/ints/floats
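That "small Redis inside a key" intuition is really just a dict; a rough analogy (field names are invented, and this is plain Python, not redis-py):

```python
# Python analogy: a Redis hash behaves much like a dict stored at one key.
profile = {}
profile["name"] = "josiah"                       # HSET user:1 name josiah
profile["views"] = profile.get("views", 0) + 5   # HINCRBY user:1 views 5
fields = list(profile.keys())                    # HKEYS user:1
```

As with lists and sets, the win over a plain dict is that the increment is atomic on the server, so concurrent clients never lose updates.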
11. What is Redis? (Sorted Sets - ZSETs)
• Like a Hash, with 'members' and 'scores',
scores limited to float values
o Get, set, delete, increment
o Can be accessed by the sorted order of the
(score,member) pair
By score
By index
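The ordering rule is worth spelling out: members are sorted by score first, with ties broken by the member string. A pure-Python sketch of that rule (the data is invented):

```python
scores = {"alice": 95.0, "bob": 87.5, "carol": 95.0}

# Redis orders a sorted set by the (score, member) pair: score first,
# then the member bytes break ties lexicographically.
ranked = sorted(scores.items(), key=lambda kv: (kv[1], kv[0]))
members = [m for m, s in ranked]  # lowest score first, as with ZRANGE
```

Because ties are resolved deterministically, index-based commands like ZRANGE and ZRANK always agree on a single stable ordering.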
12. What is Redis? (Publish/Subscribe)
• Readers subscribe to "channels" (exact
strings or patterns)
• Writers publish to channels, broadcasting to
all subscribers
• Messages are transient
13. Why Redis with Python?
• The power of Python lies in:
o Reasonably sane syntax/semantics
o Easy manipulation of data and data structures
o Large and growing community
• Redis also has:
o Reasonably sane syntax/semantics
o Easy manipulation of data and data structures
o Medium-sized and growing community
o Available as remote server
Like a remote IPython, only for data
So useful, people have asked for a library version
14. Per-hour and Per-day hit counters
from itertools import imap
import redis

def process_lines(prefix, logfile):
    conn = redis.Redis()
    # parse_line (defined elsewhere) yields entries with
    # .timestamp and .path attributes
    for log in imap(parse_line, open(logfile, 'rb')):
        time = log.timestamp.isoformat()
        hour = time.partition(':')[0]        # e.g. '2012-07-20T19'
        day = time.partition('T')[0]         # e.g. '2012-07-20'
        # note: redis-py 3.x reversed this to zincrby(name, amount, member)
        conn.zincrby(prefix + hour, log.path)
        conn.zincrby(prefix + day, log.path)
        conn.expire(prefix + hour, 7*86400)  # keep hourly data for a week
        conn.expire(prefix + day, 30*86400)  # keep daily data for a month
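The hour/day key derivation in the slide is plain string slicing on an ISO-8601 timestamp; isolated as a helper it is easy to test (the function name and prefix are my own, not from the slides):

```python
def counter_keys(prefix, timestamp_iso):
    """Derive the per-hour and per-day ZSET key names from an
    ISO-8601 timestamp such as '2012-07-20T19:30:00'."""
    hour = timestamp_iso.partition(':')[0]  # '2012-07-20T19'
    day = timestamp_iso.partition('T')[0]   # '2012-07-20'
    return prefix + hour, prefix + day
```

For example, `counter_keys('hits:', '2012-07-20T19:30:00')` yields `('hits:2012-07-20T19', 'hits:2012-07-20')`, the two ZSET keys that each hit's path is counted under.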
15. Per-hour and Per-day hit counters (with pipelines for speed)
from itertools import imap
import redis

def process_lines(prefix, logfile):
    # a non-transactional pipeline batches commands to cut round trips
    pipe = redis.Redis().pipeline(False)
    for i, log in enumerate(imap(parse_line, open(logfile, 'rb'))):
        time = log.timestamp.isoformat()
        hour = time.partition(':')[0]
        day = time.partition('T')[0]
        pipe.zincrby(prefix + hour, log.path)
        pipe.zincrby(prefix + day, log.path)
        pipe.expire(prefix + hour, 7*86400)
        pipe.expire(prefix + day, 30*86400)
        if not i % 1000:
            pipe.execute()  # flush the batch every 1000 lines
    pipe.execute()          # flush whatever remains