Caching techniques in
                                 python
                                Michael Domanski
                                europython 2010


Thursday, 22 July 2010
who I am

                     • python developer, professionally for a few
                          years now
                     • experienced also in c and objective-c
                     • currently working for 10clouds.com


Interesting intro

                     • a bit of theory
                     • common patterns
                     • common problems
                     • common solutions

How I think about
                               cache

                     • imagine a giant dict storing all your data
                     • you have to manage all data manually
                     • or provide some automated behaviour


similar to....

                     • manual memory management in c
                     • cache is memory
                     • and you have to control it manually


profits


                     • improved performance
                     • ...?


problems


                     • managing any type of memory is hard
                     • automation often has to be custom-built
                          each time




common patterns



memoization



                     • very old pattern (circa 1968)
                     • we owe the name to Donald Michie



how it works


                     • we associate input with output, and store
                          it somewhere
                     • based on the assumption that for a given
                          input, output is always the same
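
A minimal sketch of this idea, assuming the function is pure and its arguments are hashable. The names `memoize` and `slow_square` are illustrative and do not come from the talk.

```python
import functools

def memoize(func):
    """Cache results of a pure function, keyed by its arguments."""
    results = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Build a hashable key from positional and keyword arguments.
        key = (args, tuple(sorted(kwargs.items())))
        if key not in results:
            results[key] = func(*args, **kwargs)
        return results[key]

    wrapper.results = results  # exposed only so the cache can be inspected
    return wrapper

calls = []

@memoize
def slow_square(n):
    calls.append(n)  # record every real computation
    return n * n

first = slow_square(4)
second = slow_square(4)  # same input: served from the cache
```

Keying on the arguments is what ties a stored output to its input; a second call with the same input never reaches the function body.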




code example
                 CACHE_DICT = {}

                 def cached(key):
                     """Cache the decorated function's result under a fixed key."""
                     def func_wrapper(func):
                         def arg_wrapper(*args, **kwargs):
                             # compute only on a miss; the arguments do not affect the key
                             if key not in CACHE_DICT:
                                 CACHE_DICT[key] = func(*args, **kwargs)
                             return CACHE_DICT[key]
                         return arg_wrapper
                     return func_wrapper




what if output can
                               change?

                     • our pattern is still useful
                     • we simply need to add something


cache invalidation



There are only two hard problems in Computer
                           Science: cache invalidation and naming things
                                                                  Phil Karlton


                     • basically, we update data in cache
                     • we need to know when and what to
                          change

                     • the more granular you want to be, the
                          harder it gets
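
One way to answer the "when and what" question is to invalidate at the point where the data is written. The sketch below assumes a plain dict standing in for the database; `FAKE_DB`, `db_get`, `db_set` and `get_user` are hypothetical names used only for illustration.

```python
CACHE = {}
FAKE_DB = {1: "alice"}  # stand-in for a real database (assumption)

def db_get(id):
    return FAKE_DB[id]

def get_user(id):
    """Read through the cache, using one cache key per record."""
    key = "user:%s" % id
    if key not in CACHE:
        CACHE[key] = db_get(id)
    return CACHE[key]

def db_set(id, value):
    """Write to the store and drop only the key that became stale."""
    FAKE_DB[id] = value
    CACHE.pop("user:%s" % id, None)

before = get_user(1)   # fills the cache
db_set(1, "bob")       # the write invalidates 'user:1'
after = get_user(1)    # recomputed from the store
```

Here the granularity is one key per record, which keeps the "what" easy to derive from the write itself.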




code example
                 def invalidate(key):
                     try:
                         del CACHE_DICT[key]
                     except KeyError:
                         print("someone tried to invalidate a key that is not present: %s" % key)




common problems



invalidating too much/
                                not enough

                     • flushing all data any time something changes
                     • not flushing cache at all
                     • tragic effects
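
The two extremes can be compared in a small sketch. It assumes a plain dict as the cache; the keys and the helper names `flush_all` and `invalidate_one` are made up for the example.

```python
cache = {"user:1": "alice", "user:2": "bob", "report": "big result"}

def flush_all(cache):
    """Invalidate too much: every entry is lost on any change."""
    cache.clear()

def invalidate_one(cache, key):
    """Targeted invalidation: drop only the stale entry."""
    cache.pop(key, None)

# Suppose only the record behind 'user:1' changed in the database.
too_much = dict(cache)
flush_all(too_much)                 # 'user:2' and 'report' must be recomputed too

targeted = dict(cache)
invalidate_one(targeted, "user:1")  # unrelated entries survive

not_enough = dict(cache)            # nothing invalidated: 'user:1' stays stale
```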


                 # db_get / db_set stand for the database access layer (not shown on the slide)
                 @cached('key1')
                 def simple_function1():
                     return db_get(id=1)

                 @cached('key2')
                 def simple_function2():
                     return db_get(id=2)

                 # SUPPOSE THIS IS IN ANOTHER MODULE

                 @cached('big_key1')
                 def some_bigger_function():
                     """
                     this function depends on big_key1, key1 and key2
                     """
                     def inner_workings():
                         # writes to the db, but never invalidates 'key1'
                         db_set(1, 'something totally new')
                     #######
                     ##   imagine 100 lines of code here :)
                     ######
                     inner_workings()

                     return [simple_function1(), simple_function2()]

                 if __name__ == '__main__':
                     simple_function1()
                     simple_function2()
                     a, b = some_bigger_function()
                     assert a == db_get(id=1), "this fails because we didn't invalidate the cache properly"




invalidating too soon/
                                  too late

                     • your cache has to be synchronised with your
                          db
                     • sometimes very hard to spot
                     • leads to tragic mistakes


                 @cached('key1')
                 def simple_function1():
                     return db_get(id=1)

                 @cached('key2')
                 def simple_function2():
                     return db_get(id=2)

                 # SUPPOSE THIS IS IN ANOTHER MODULE

                 def some_bigger_function():
                     db_set(1, 'something')
                     value = simple_function1()
                     db_set(2, 'something else')
                     #### now we know we used 2 cached functions so....
                     invalidate('key1')
                     invalidate('key2')
                     #### now we know we are safe, but for a price
                     return simple_function2()

                 if __name__ == '__main__':
                     some_bigger_function()




superposition of
                               dependencies
                     • a somewhat less obvious problem
                     • eventually you will start caching the results of
                          computation
                     • you have to know very precisely what
                          your data depends on
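
Knowing precisely what your data depends on can be made explicit by recording the dependencies and cascading invalidation through them. This is only a sketch under stated assumptions: a dict-based cache, no dependency cycles, and a hypothetical `depends_on` registry that is not part of the talk.

```python
CACHE = {}
DEPENDENTS = {}  # key -> set of keys computed from it

def depends_on(key, *sources):
    """Record that `key` is derived from each key in `sources`."""
    for source in sources:
        DEPENDENTS.setdefault(source, set()).add(key)

def invalidate(key):
    """Drop `key` and, recursively, everything derived from it."""
    CACHE.pop(key, None)
    for dependent in DEPENDENTS.get(key, ()):
        invalidate(dependent)

depends_on("key", "key1", "key2")  # the composite entry uses both
CACHE.update({"key1": "old", "key2": "b", "key": {"1": "old", "2": "b"}})

invalidate("key1")  # also removes the composite 'key'
```

With the registry in place, invalidating a low-level key cannot leave a stale composite entry behind, at the cost of maintaining the dependency map by hand.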



                 @cached('key1')
                 def simple_function1():
                     return db_get(id=1)

                 @cached('key2')
                 def simple_function2():
                     return db_get(id=2)

                 # SUPPOSE THIS IS IN ANOTHER MODULE

                 @cached('key')
                 def some_bigger_function():
                     # the cached result depends on 'key1', 'key2' and db row 3
                     return {
                         '1': simple_function1(),
                         '2': simple_function2(),
                         '3': db_get(id=3)
                     }

                 if __name__ == '__main__':
                     simple_function1()
                     # somewhere else
                     db_set(1, 'foobar')
                     # and again
                     db_set(3, 'bazbar')
                     invalidate('key')
                     # ooops, we forgot something: 'key1' is still cached
                     data = some_bigger_function()
                     assert data['1'] == db_get(id=1), "this fails because we didn't manage to invalidate all the keys"




summing up
                     • know your data....
                     • be aware of what you cache and when
                     • take care when using cached data in
                          computation




common solutions



process level cache



why?

                     • very fast access
                     • simple to implement
                     • very effective as long as you’re using a single
                          process
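
In current Python versions the standard library covers the simple single-process case: `functools.lru_cache` was added in Python 3.2, after this talk was given. The sketch below is not from the slides, and `load_setting` is a made-up function.

```python
import functools

calls = []

@functools.lru_cache(maxsize=128)
def load_setting(name):
    calls.append(name)       # counts real computations
    return name.upper()

load_setting("debug")
load_setting("debug")        # hit: served from the process-local cache
hits_before_clear = load_setting.cache_info().hits

load_setting.cache_clear()   # invalidation: drops every entry for this function
load_setting("debug")        # miss: computed again
```

The cache lives inside one interpreter process, so a second worker process keeps its own independent copy.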




clever tricks with dicts



code example
                 CACHE_DICT = {}

                 def cached(key):
                     """Cache the decorated function's result under a fixed key."""
                     def func_wrapper(func):
                         def arg_wrapper(*args, **kwargs):
                             # compute only on a miss; the arguments do not affect the key
                             if key not in CACHE_DICT:
                                 CACHE_DICT[key] = func(*args, **kwargs)
                             return CACHE_DICT[key]
                         return arg_wrapper
                     return func_wrapper




invalidation



code example
                 def invalidate(key):
                     try:
                         del CACHE_DICT[key]
                     except KeyError:
                         print("someone tried to invalidate a key that is not present: %s" % key)




application level cache



memcache



                     • battle tested
                     • scales
                     • fast
                     • supports a few cool features
                     • behaves a lot like dict
                     • supports time-based expiration
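
Time-based expiration can be imitated in-process to see how it behaves. The `ExpiringCache` class below is a hypothetical sketch, not memcached's implementation; memcached handles expiry on the server side, and the slides later pass an expiry time as the third argument to `cache.set`.

```python
import time

class ExpiringCache:
    """Dict-like cache whose entries expire after `ttl` seconds."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock   # injectable so expiry can be demonstrated
        self._data = {}       # key -> (expires_at, value)

    def set(self, key, value, ttl):
        self._data[key] = (self._clock() + ttl, value)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]   # lazily evict the stale entry
            return default
        return value

now = [0.0]                       # fake clock for the demonstration
cache = ExpiringCache(clock=lambda: now[0])
cache.set("greeting", "hello", ttl=30)
fresh = cache.get("greeting")
now[0] = 31.0                     # 31 "seconds" later
stale = cache.get("greeting")
```

Expiry by time needs no invalidation calls at all, but it accepts that data may be stale for up to `ttl` seconds.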

libraries?

                     • python-memcache
                     • python-libmemcache
                     • python-cmemcache
                     • pylibmc

why no benchmarks

                     • not the point of this talk :)
                     • benchmarks are generic, caching is specific
                     • pick your flavour, think for yourself


code example
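                 import memcache  # the python-memcache client

                 cache = memcache.Client(['localhost:11211'])

                 def memcached(key):
                     def func_wrapper(func):
                         def arg_wrapper(*args, **kwargs):
                             value = cache.get(str(key))
                             # `is None` rather than `not value`, so cached falsy
                             # results (0, '', []) still count as hits
                             if value is None:
                                 value = func(*args, **kwargs)
                                 cache.set(str(key), value)
                             return value
                         return arg_wrapper
                     return func_wrapper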




invalidation



code example
                 def mem_invalidate(key):
                     # overwrite with None, which the decorator treats as a miss
                     cache.set(str(key), None)




batch key management



                     • what if I don’t want to expire each key
                          manually

                     • that’s a lot to remember
                     • and we have to be careful :(


groups?

                     • group keys into sets
                     • which are tied to one key per set
                     • expire one key, instead of twenty


how to get there?

                     • store some extra data
                     • you can store dicts in cache
                     • and cache behaves like dict
                     • so it’s a case of comparing keys and values

                 # we start with a specified key and group
                 key = 'some_key'
                 group = 'some_group'

                 def get_grouped(memcached_client, key, group):
                     # (illustrative wrapper name; the slide shows only the body)
                     # retrieve both entries from memcached in one call
                     data = memcached_client.get_multi([key, group])
                     # data is a dict that should look like
                     # {'some_key': {'group_key': '1234',
                     #               'value': 'some_value'},
                     #  'some_group': '1234'}
                     if data and (key in data) and (group in data):
                         if data[key]['group_key'] == data[group]:
                             return data[key]['value']
                     return None




          # tools.make_key and make_group_value are helpers that are not shown on the slides
          def cached(key, group_key='', exp_time=0):
              # we don't want to mix time based and event based expiration models
              if group_key:
                  assert exp_time == 0, "can't set expiration time for grouped keys"

              def f_wrapper(func):
                  def arg_wrapper(*args, **kwargs):
                      value = None
                      if group_key:
                          data = cache.get_multi([tools.make_key(group_key), tools.make_key(key)])
                          data_dict = data.get(tools.make_key(key))
                          if data_dict:
                              value = data_dict['value']
                              group_value = data_dict['group_value']
                              # stale if the group's value changed since this entry was stored
                              if group_value != data.get(tools.make_key(group_key)):
                                  value = None
                      else:
                          value = cache.get(tools.make_key(key))
                      if value is None:
                          value = func(*args, **kwargs)
                          if exp_time:
                              cache.set(tools.make_key(key), value, exp_time)
                          elif not group_key:
                              cache.set(tools.make_key(key), value)
                          else:  # exp_time not set and we have group_keys
                              group_value = make_group_value(group_key)
                              data_dict = {'value': value, 'group_value': group_value}
                              cache.set_multi({tools.make_key(key): data_dict,
                                               tools.make_key(group_key): group_value})
                      return value
                  arg_wrapper.__name__ = func.__name__
                  return arg_wrapper
              return f_wrapper
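
The group scheme can be tried out without a memcached server. The sketch below uses a plain dict as a stand-in for the cache, and its `make_group_value`, `set_grouped`, `get_grouped` and `invalidate_group` are hypothetical helpers written for this example; the talk's own helpers are not shown on the slides and may behave differently.

```python
import uuid

store = {}  # stand-in for memcached; a plain dict (assumption)

def make_group_value(group_key):
    """Hypothetical helper: a fresh token for the group's current generation."""
    return uuid.uuid4().hex

def set_grouped(key, group_key, value):
    # Reuse the group's current token, or create one on first use.
    token = store.setdefault(group_key, make_group_value(group_key))
    store[key] = {"value": value, "group_value": token}

def get_grouped(key, group_key):
    entry = store.get(key)
    if entry and entry["group_value"] == store.get(group_key):
        return entry["value"]
    return None  # missing, or written under an older group token

def invalidate_group(group_key):
    # Changing one key makes every entry in the group stale.
    store[group_key] = make_group_value(group_key)

set_grouped("user:1", "users", "alice")
set_grouped("user:2", "users", "bob")
before = (get_grouped("user:1", "users"), get_grouped("user:2", "users"))
invalidate_group("users")
after = (get_grouped("user:1", "users"), get_grouped("user:2", "users"))
```

Stale entries are never deleted explicitly here; they simply stop matching the group token and are overwritten on the next write.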



questions?



code samples @
                       http://github.com/
                    mdomans/europython2010

follow me

                 twitter: mdomans
                 blog:    blog.mdomans.com



Caching Techniques in Python: Memoization, Invalidation and Process Level Cache

  • 1. Caching techinques in python Michael Domanski europython 2010 czwartek, 22 lipca 2010
  • 2. who I am • Python developer, professionally for a few years now • also experienced in C and Objective-C • currently working for 10clouds.com
  • 3. Interesting intro • a bit of theory • common patterns • common problems • common solutions
  • 4. How I think about cache • imagine a giant dict storing all your data • you have to manage all data manually • or provide some automated behaviour
  • 5. similar to.... • manual memory management in C • cache is memory • and you have to control it manually
  • 6. profits • improved performance • ...?
  • 7. problems • managing any type of memory is hard • automation often has to be custom-built each time
  • 10. • very old pattern (circa 1968) • we owe the name to Donald Michie
  • 11. how it works • we associate input with output, and store it somewhere • based on the assumption that for a given input, the output is always the same
  • 12. code example

        CACHE_DICT = {}

        def cached(key):
            def func_wrapper(func):
                def arg_wrapper(*args, **kwargs):
                    # compute and store the value only on a cache miss
                    if key not in CACHE_DICT:
                        value = func(*args, **kwargs)
                        CACHE_DICT[key] = value
                    return CACHE_DICT[key]
                return arg_wrapper
            return func_wrapper

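The decorator on slide 12 stores one value per fixed key, so the function's arguments play no part in the lookup. A minimal sketch of the "associate input with output" idea from slide 11, keyed on the arguments themselves, might look like the following. The names `memoize`, `square` and `calls` are illustrative and not from the talk, and the sketch assumes every argument is hashable.

```python
import functools


def memoize(func):
    """Cache results of func keyed on its (hashable) arguments."""
    cache = {}

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Build a hashable key from positional and keyword arguments.
        key = (args, tuple(sorted(kwargs.items())))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    wrapper.cache = cache  # exposed so callers can inspect or clear it
    return wrapper


calls = []  # records real invocations, to make cache hits visible


@memoize
def square(x):
    calls.append(x)
    return x * x


square(4)
square(4)  # served from the cache, the body of square() is not run again
square(5)
```

After these three calls, `calls` holds one entry per distinct input, which is the behaviour the memoization assumption relies on.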
  • 13. what if output can change? • our pattern is still useful • we simply need to add something
  • 15. "There are only two hard problems in Computer Science: cache invalidation and naming things." Phil Karlton
  • 16. • basically, we update data in cache • we need to know when and what to change • the more granular you want to be, the harder it gets
  • 17. code example

        def invalidate(key):
            try:
                del CACHE_DICT[key]
            except KeyError:
                print("someone tried to invalidate a key that is not present: %s" % key)

  • 19. invalidating too much / not enough • flushing all data any time something changes • not flushing cache at all • tragic effects
  • 20.

        @cached('key1')
        def simple_function1():
            return db_get(id=1)

        @cached('key2')
        def simple_function2():
            return db_get(id=2)

        # SUPPOSE THIS IS IN ANOTHER MODULE

        @cached('big_key1')
        def some_bigger_function():
            """
            this function depends on big_key1, key1 and key2
            """
            def inner_workings():
                db_set(1, 'something totally new')
            #######
            ## imagine 100 lines of code here :)
            ######
            inner_workings()
            return [simple_function1(), simple_function2()]

        if __name__ == '__main__':
            simple_function1()
            simple_function2()
            a, b = some_bigger_function()
            assert a == db_get(id=1), "this fails because we didn't invalidate the cache properly"

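One way to avoid the failure on slide 20 is to tie invalidation to the write path, so that nobody has to remember it a hundred lines later. The sketch below is only an illustration under stated assumptions: `DB`, `KEYS_FOR_ROW` and `db_set_and_invalidate` are hypothetical names, and the dict-backed `db_get`/`db_set` stand in for the real database helpers the slides do not show.

```python
CACHE_DICT = {}
DB = {1: 'old', 2: 'other'}  # hypothetical stand-in for a real database


def db_get(id):
    return DB[id]


def db_set(id, value):
    DB[id] = value


def cached(key):
    def func_wrapper(func):
        def arg_wrapper(*args, **kwargs):
            if key not in CACHE_DICT:
                CACHE_DICT[key] = func(*args, **kwargs)
            return CACHE_DICT[key]
        return arg_wrapper
    return func_wrapper


def invalidate(key):
    CACHE_DICT.pop(key, None)


# Hypothetical mapping from a database row id to the cache keys built from it.
KEYS_FOR_ROW = {1: ['key1'], 2: ['key2']}


def db_set_and_invalidate(id, value):
    """Write to the database, then drop every cache key derived from that row."""
    db_set(id, value)
    for key in KEYS_FOR_ROW.get(id, []):
        invalidate(key)


@cached('key1')
def simple_function1():
    return db_get(id=1)


simple_function1()  # warms the cache with 'old'
db_set_and_invalidate(1, 'something totally new')
result = simple_function1()  # recomputed from the database
```

The price is that the mapping has to be kept complete by hand, which is the "know your data" problem the talk keeps returning to.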
  • 21. invalidating too soon / too late • your cache has to be synchronised with your db • sometimes very hard to spot • leads to tragic mistakes
  • 22.

        @cached('key1')
        def simple_function1():
            return db_get(id=1)

        @cached('key2')
        def simple_function2():
            return db_get(id=2)

        # SUPPOSE THIS IS IN ANOTHER MODULE

        def some_bigger_function():
            db_set(1, 'something')
            value = simple_function1()
            db_set(2, 'something else')
            #### now we know we used 2 cached functions so....
            invalidate('key1')
            invalidate('key2')
            #### now we know we are safe, but for a price
            return simple_function2()

        if __name__ == '__main__':
            some_bigger_function()

  • 23. superposition of dependencies • a somewhat less obvious problem • eventually you will start caching the results of computation • you have to know very precisely what your data depends on
  • 24.

        @cached('key1')
        def simple_function1():
            return db_get(id=1)

        @cached('key2')
        def simple_function2():
            return db_get(id=2)

        # SUPPOSE THIS IS IN ANOTHER MODULE

        @cached('key')
        def some_bigger_function():
            return {
                '1': simple_function1(),
                '2': simple_function2(),
                '3': db_get(id=3)
            }

        if __name__ == '__main__':
            simple_function1()
            # somewhere else
            db_set(1, 'foobar')
            # and again
            db_set(3, 'bazbar')
            invalidate('key')
            # ooops, we forgot something
            data = some_bigger_function()
            assert data['1'] == db_get(id=1), "this fails because we didn't manage to invalidate all the keys"

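The forgotten `invalidate('key1')` on slide 24 can be guarded against by recording which cached values were computed from which others, and letting invalidation cascade. This is a rough sketch, not the speaker's method: the `DEPENDENTS` registry is a hypothetical addition, `DB` is a dict standing in for the database, and the recursion assumes the dependency graph has no cycles.

```python
CACHE_DICT = {}
DB = {1: 'a', 2: 'b', 3: 'c'}  # hypothetical stand-in for a real database

# Hypothetical registry: cache key -> keys whose values were computed from it.
DEPENDENTS = {'key1': ['key'], 'key2': ['key']}


def cached(key):
    def func_wrapper(func):
        def arg_wrapper(*args, **kwargs):
            if key not in CACHE_DICT:
                CACHE_DICT[key] = func(*args, **kwargs)
            return CACHE_DICT[key]
        return arg_wrapper
    return func_wrapper


def invalidate(key):
    """Drop key and, recursively, every key that depends on it."""
    CACHE_DICT.pop(key, None)
    for dependent in DEPENDENTS.get(key, []):
        invalidate(dependent)


@cached('key1')
def simple_function1():
    return DB[1]


@cached('key2')
def simple_function2():
    return DB[2]


@cached('key')
def some_bigger_function():
    return {'1': simple_function1(), '2': simple_function2(), '3': DB[3]}


some_bigger_function()  # warms 'key', 'key1' and 'key2'
DB[1] = 'foobar'
invalidate('key1')  # also drops 'key' through DEPENDENTS
data = some_bigger_function()
```

Invalidating the low-level key is now enough: the composite value is rebuilt on the next call, while the untouched `'key2'` entry stays cached.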
  • 25. summing up • know your data.... • be aware what and when you cache • take care when using cached data in computation
  • 28. why? • very fast access • simple to implement • very effective as long as you’re using a single process
  • 29. clever tricks with dicts
  • 30. code example

        CACHE_DICT = {}

        def cached(key):
            def func_wrapper(func):
                def arg_wrapper(*args, **kwargs):
                    # compute and store the value only on a cache miss
                    if key not in CACHE_DICT:
                        value = func(*args, **kwargs)
                        CACHE_DICT[key] = value
                    return CACHE_DICT[key]
                return arg_wrapper
            return func_wrapper

  • 32. code example

        def invalidate(key):
            try:
                del CACHE_DICT[key]
            except KeyError:
                print("someone tried to invalidate a key that is not present: %s" % key)

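For a process-level cache in current Python, the standard library's `functools.lru_cache` covers the same ground as the hand-written dict decorator. It was added to the standard library after this talk, so the sketch below is offered as a present-day comparison rather than something from the slides. `load_user` and `calls` are illustrative names.

```python
import functools

calls = []  # records real invocations, to make cache hits visible


@functools.lru_cache(maxsize=128)
def load_user(user_id):
    calls.append(user_id)
    return {'id': user_id}  # stand-in for an expensive lookup


load_user(1)
load_user(1)  # cache hit, the function body does not run
info = load_user.cache_info()  # hit and miss counters collected so far
load_user.cache_clear()  # invalidates every entry at once
load_user(1)  # recomputed after the clear
```

Note the trade-off: `cache_clear()` flushes everything, so finer-grained invalidation still needs something like the explicit `invalidate(key)` shown on the slides.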
  • 35. • battle tested • scales • fast • supports a few cool features • behaves a lot like dict • supports time-based expiration
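Time-based expiration is handled by the memcached server itself. To show the idea without a running server, here is a small in-process imitation. `TTLCache` and its injectable `clock` are hypothetical names introduced for this sketch, and nothing here reflects how memcached implements expiry internally.

```python
import time


class TTLCache:
    """Tiny in-process cache whose entries expire after ttl seconds."""

    def __init__(self, clock=time.monotonic):
        self._data = {}  # key -> (expires_at, value)
        self._clock = clock  # injectable so the example can be driven by hand

    def set(self, key, value, ttl):
        self._data[key] = (self._clock() + ttl, value)

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # lazily evict the stale entry
            return default
        return value


now = [0.0]  # fake clock, advanced manually below
cache = TTLCache(clock=lambda: now[0])
cache.set('greeting', 'hello', ttl=60)
fresh = cache.get('greeting')  # still within the 60 second window
now[0] = 61.0
stale = cache.get('greeting')  # expired, so the default (None) comes back
```

Expiry by time trades correctness for simplicity: no invalidation calls are needed, but readers may see data that is up to `ttl` seconds old.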
  • 36. libraries? • python-memcache • python-libmemcache • python-cmemcache • pylibmc
  • 37. why no benchmarks • not the point of this talk :) • benchmarks are generic, caching is specific • pick your flavour, think for yourself
  • 38. code example

        import memcache

        cache = memcache.Client(['localhost:11211'])

        def memcached(key):
            def func_wrapper(func):
                def arg_wrapper(*args, **kwargs):
                    value = cache.get(str(key))
                    # test against None so cached falsy values (0, '', []) still count as hits
                    if value is None:
                        value = func(*args, **kwargs)
                        cache.set(str(key), value)
                    return value
                return arg_wrapper
            return func_wrapper

  • 40. code example

        def mem_invalidate(key):
            # storing None makes the next cache.get() look like a miss
            cache.set(str(key), None)

  • 42. • what if I don’t want to expire each key manually? • that’s a lot to remember • and we have to be careful :(
  • 43. groups? • group keys into sets • which are tied to one key per set • expire one key, instead of twenty
  • 44. how to get there? • store some extra data • you can store dicts in cache • and cache behaves like dict • so it’s a case of comparing keys and values
  • 45.

        # we start with specified key and group
        key = 'some_key'
        group = 'some_group'
        # now retrieve some data from memcached (get_multi takes a list of keys)
        data = memcached_client.get_multi([key, group])
        # now data is a dict that should look like
        # {'some_key' : {'group_key' : '1234',
        #                'value' : 'some_value' },
        #  'some_group' : '1234'}
        #
        if data and (key in data) and (group in data):
            if data[key]['group_key'] == data[group]:
                return data[key]['value']

  • 46.

        # tools.make_key and make_group_value are helpers not shown on the slide
        def cached(key, group_key='', exp_time=0):
            # we don't want to mix time based and event based expiration models
            if group_key:
                assert exp_time == 0, "can't set expiration time for grouped keys"
            def f_wrapper(func):
                def arg_wrapper(*args, **kwargs):
                    value = None
                    if group_key:
                        data = cache.get_multi([tools.make_key(group_key)] + [tools.make_key(key)])
                        data_dict = data.get(tools.make_key(key))
                        if data_dict:
                            value = data_dict['value']
                            group_value = data_dict['group_value']
                            # .get() avoids a KeyError when the group key itself is missing
                            if group_value != data.get(tools.make_key(group_key)):
                                value = None
                    else:
                        # read with the same key transformation that is used when writing
                        value = cache.get(tools.make_key(key))
                    if value is None:
                        value = func(*args, **kwargs)
                        if exp_time:
                            cache.set(tools.make_key(key), value, exp_time)
                        elif not group_key:
                            cache.set(tools.make_key(key), value)
                        else:
                            # exp_time not set and we have group_keys
                            group_value = make_group_value(group_key)
                            data_dict = {'value': value, 'group_value': group_value}
                            cache.set_multi({
                                tools.make_key(key): data_dict,
                                tools.make_key(group_key): group_value,
                            })
                    return value
                arg_wrapper.__name__ = func.__name__
                return arg_wrapper
            return f_wrapper

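The slides show how grouped entries are read and written, but not the "expire one key, instead of twenty" step itself. The sketch below fills that in under loud assumptions: a plain dict named `STORE` stands in for the memcached client, `group_cached`, `new_group_value` and `invalidate_group` are hypothetical names, and an existing group token is reused on a miss, which may differ from what the speaker's `make_group_value` helper does.

```python
import uuid

STORE = {}  # hypothetical dict standing in for a memcached client


def new_group_value():
    return uuid.uuid4().hex


def group_cached(key, group_key):
    def f_wrapper(func):
        def arg_wrapper(*args, **kwargs):
            entry = STORE.get(key)
            group_value = STORE.get(group_key)
            # A hit requires the entry's token to match the group's current token.
            if (entry is not None and group_value is not None
                    and entry['group_value'] == group_value):
                return entry['value']
            value = func(*args, **kwargs)
            if group_value is None:
                group_value = STORE[group_key] = new_group_value()
            STORE[key] = {'value': value, 'group_value': group_value}
            return value
        return arg_wrapper
    return f_wrapper


def invalidate_group(group_key):
    """Rotate the group's token; every entry tagged with the old one goes stale."""
    STORE[group_key] = new_group_value()


calls = []  # records real invocations, to make cache hits visible


@group_cached('user:1', 'users')
def user1():
    calls.append('user1')
    return 'alice'


@group_cached('user:2', 'users')
def user2():
    calls.append('user2')
    return 'bob'


user1()
user2()
user1()  # hit: the token stored with 'user:1' still matches the group
invalidate_group('users')  # one write expires both entries
user1()
user2()
```

Stale entries are never deleted here, only ignored. With a real memcached server they would eventually be evicted, which is one reason this pattern suits it.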
  • 48. code samples @ http://github.com/mdomans/europython2010
  • 49. follow me • twitter: mdomans • blog: blog.mdomans.com