Ce diaporama a bien été signalé.
Nous utilisons votre profil LinkedIn et vos données d’activité pour vous proposer des publicités personnalisées et pertinentes. Vous pouvez changer vos préférences de publicités à tout moment.

Memory Management In Python The Basics

27 758 vues

Publié le

Memory Management In Python The Basics

Publié dans : Ingénierie
  • Hi Nina, recently i was going through your Pycon video in youtube explaining the memory management. I have one question, when we say x = 300 and then y = 300, how does the newly created variable points to the existing memory location and makes the reference count increment. Do we have any table to map this that as and when a new variable is created; it checks whether the value exist in any memory location or not
       Répondre 
    Voulez-vous vraiment ?  Oui  Non
    Votre message apparaîtra ici
  • @Vallury Phaniram x = 300; y = 300 NEXTLINE: id(x) == id(y) NEXTLINE: x is y (Sorry for the format, slideshare strips characters)
       Répondre 
    Voulez-vous vraiment ?  Oui  Non
    Votre message apparaîtra ici
  • @Vallury Phaniram They will be different if you try this example in an interactive interpreter/REPL environment. See the warning at the beginning of the slides. If you're using the REPL, Try the following:
       Répondre 
    Voulez-vous vraiment ?  Oui  Non
    Votre message apparaîtra ici
  • @Russell Parks Missed that one! Thank you! They were node1 and node2 originally, but I decided that left/right would be clearer.
       Répondre 
    Voulez-vous vraiment ?  Oui  Non
    Votre message apparaîtra ici
  • slide 24 has an error. id of x and y will be different. "x = 300, y = x". only now id(x) and id(y) will be same and x is y will return true.
       Répondre 
    Voulez-vous vraiment ?  Oui  Non
    Votre message apparaîtra ici

Memory Management In Python The Basics

  1. 1. BASICS OF MEMORY MANAGEMENT IN PYTHON Nina Zakharenko
  2. 2. WHY SHOULD YOU CARE? Knowing about memory management helps you write more efficient code.
  3. 3. WHAT WILL YOU GET? ∎Vocabulary ∎Basic Concepts ∎Foundation
  4. 4. WHAT WON’T YOU GET? You won’t be an expert at the end of this talk.
  5. 5. WHAT’S A VARIABLE?
  6. 6. What’s a C-style variable? Memory variable location Value a 0x3E8 101 b 0x3E9 101 These values live in a fixed size bucket. Can only hold same-sized data, or an overflow occurs.
  7. 7. What’s a C-style variable? Memory location Value 0x3E8 101 0x3E9 101 Later… 110 The data in this memory location is overwritten.
  8. 8. PYTHON HAS NAMES, NOT VARIABLES
  9. 9. How are python objects stored in memory? names references objects
  10. 10. A name is just a label for an object. In python, each object can have lots of names.
  11. 11. Simple • numbers • strings Different Types of Objects Containers •dict •list • user defined- classes
  12. 12. What is a reference? A name or a container object pointing at another object.
  13. 13. What is a reference count?
  14. 14. How can we increase the ref count? 300x = 300 x references: 1 +1
  15. 15. How can we increase the ref count? 300 x = 300 y = 300 x references: 2 y +1
  16. 16. How can we increase the ref count? 300 z = [300, 300] x references: 4 y
  17. 17. Decrease Ref Count - del 300 x = 300 y = 300 del x references: 1 yx
  18. 18. What does del do? The del statement doesn’t delete objects. It: • removes that name as a reference to that object • reduces the ref count by 1
  19. 19. Decrease Ref Count - Change Reference x = 300 y = 300 300 references:0 yy = None
  20. 20. Decrease Ref Count - Going out of Scope def print_word(): word = 'Seven' print('Word is ' + word) ref count +1 ‘seven’ is out of scope. ref count -1 print_word()
  21. 21. local vs. global namespace ■If refcounts decrease when an object goes out of scope, what happens to objects in the global namespace? ■Never go out of scope! Refcount never reaches 0. ■Avoid putting large or complex objects in the global namespace.
  22. 22. Every python object holds 3 things ∎Its type ∎Its value ∎A reference count
  23. 23. PyObject type integer refcount 2 value 300 Names References x y
  24. 24. x = 300 y = 300 print( id(x) ) > 28501818 print( id(y) ) > 28501818 print x is y > True * don’t try this in an interactive environment (REPL)
  25. 25. GARBAGE COLLECTION
  26. 26. What is Garbage Collection? A way for a program to automatically release memory when the object taking up that space is no longer in use.
  27. 27. Two Main Types of Garbage Collection Reference Counting Tracing
  28. 28. How does reference counting garbage collection work? Add and Remove References Refcount Reaches 0 Cascading Effect
  29. 29. The Good • Easy to Implement • When refcount is 0, objects are immediately deleted. Reference Counting Garbage Collection The Bad • space overhead - reference count is stored for every object • execution overhead - reference count changed on every assignment
  30. 30. The Ugly • Not generally thread safe • Reference counting doesn’t detect cyclical references Reference Counting Garbage Collection
  31. 31. Cyclical References By Example class Node: def __init__(self, value): self.value = value def next(self, next): self.next = next
  32. 32. What’s a cyclical reference? left right root rc = 1 rc = 3 rc = 2 root = Node('root') left = Node('left') right = Node(‘right') root.next(left) left.next(right) right.next(left)
  33. 33. What’s a cyclical reference? del root del node1 del node2 left right root rc = 0 rc = 1 rc = 1
  34. 34. Reference counting alone will not garbage collect objects with cyclical references.
  35. 35. Two Main Types of Garbage Collection Reference Counting Tracing
  36. 36. Tracing Garbage Collection ■source: http://webappguru.blogspot.com/2015/11/mark-and-sweep-garbage-collection.html
  37. 37. Tracing Garbage Collection ■source: http://webappguru.blogspot.com/2015/11/mark-and-sweep-garbage-collection.html
  38. 38. What does Python use? Reference Counting Generational+
  39. 39. Generational Garbage Collection is based on the theory that most objects die young. ■ source: http://cs.ucsb.edu/~ckrintz/racelab/gc/papers/hoelzle-jvm98.pdf
  40. 40. Python maintains a list of every object created as a program is run. Actually, it makes 3. generation 0 generation 1 generation 2 Newly created objects are stored in generation 0.
  41. 41. Only container objects with a refcount greater than 0 will be stored in a generation list.
  42. 42. When the number of objects in a generation reaches a threshold, python runs a garbage collection algorithm on that generation, and any generations younger than it.
  43. 43. What happens during a generational garbage collection cycle? Python makes a list for objects to discard. It runs an algorithm to detect reference cycles. If an object has no outside references, it’s put on the discard list. When the cycle is done, it frees up the objects on the discard list.
  44. 44. After a garbage collection cycle, objects that survived will be promoted to the next generation. Objects in the last generation (2) stay there as the program executes.
  45. 45. When the ref count reaches 0, you get immediate clean up. If you have a cycle, you need to wait for garbage collection.
  46. 46. REFERENCE COUNTING GOTCHAS
  47. 47. Reference counting is not generally thread-safe. We’ll see why this is a big deal™ later.
  48. 48. Remember our cycle from before? left rightrc = 1 rc = 1 Cyclical references get cleaned up by generational garbage collection.
  49. 49. Cyclical Reference Cleanup Except in python2 if they have a __del__ method. **fixed in python 3.4! - https://www.python.org/dev/peps/pep-0442/ Gotcha!
  50. 50. The __del__ magic method ■ Sometimes called a “destructor” ■Not the del statement. ■ Runs before an object is removed from memory
  51. 51. __slots__
  52. 52. What are __slots__? class Dog(object): pass buddy = Dog() buddy.name = 'Buddy' print(buddy.__dict__) {'name': 'Buddy'}
  53. 53. What are __slots__? 'Pug'.name = 'Fred' AttributeError Traceback (most recent call last) ----> 1 'Pug'.name = 'Fred' AttributeError: 'str' object has no attribute 'name'
  54. 54. class Point(object): __slots__ = ('x', 'y') What are __slots__? What is the type of __slots__? point.name = "Fred" Traceback (most recent call last): File "point.py", line 8, in <module> point.name = "Fred" AttributeError: 'Point' object has no attribute 'name' point = Point() point.x = 5 point.y = 7
  55. 55. size of dict vs. size of tuple import sys sys.getsizeof(dict()) sys.getsizeof(tuple()) sizeof dict: 288 bytes sizeof tuple: 48 bytes
  56. 56. When would we want to use __slots__? ■ If we’re going to be creating many instances of a class ■If we know in advance what properties the class should have
  57. 57. WHAT’S A GIL?
  58. 58. GLOBAL INTERPETER LOCK
  59. 59. Only one thread can run in the interpreter at a time.
  60. 60. Upside Fast & Simple Garbage Collection Advantages / Disadvantages of a GIL Downside In a python program, no matter how many threads exist, only one thread will be executed at a time.
  61. 61. ■Use multi-processing instead of multi- threading. ■Each process will have it’s own GIL, it’s on the developer to figure out a way to share information between processes. Want to take advantage of multiple CPUs?
  62. 62. If the GIL limits us, can’t we just remove it? additional reading: https://docs.python.org/3/faq/library.html#can-t-we-get-rid-of-the-global-interpreter-lock
  63. 63. For better or for worse, the GIL is here to stay!
  64. 64. WHAT DID WE LEARN?
  65. 65. Garbage collection is pretty good.
  66. 66. Now you know how memory is managed.
  67. 67. Consider python3
  68. 68. Or, for scientific applications numpy & pandas.
  69. 69. Thanks! @nnja nina.writes.code@gmail.com [TODO SLIDESHARE LINK]
  70. 70. Bonus Material
  71. 71. Additional Reading • Great explanation of generational garbage collection and python’s reference detection algorithm. • https://www.quora.com/How-does-garbage-collection-in-Python- work • Weak Reference Documentation • https://docs.python.org/3/library/weakref.html • Python Module of the Week - gc • https://pymotw.com/2/gc/ • PyPy STM - GIL less Python Interpreter • http://morepypy.blogspot.com/2015/03/pypy-stm-251- released.html • Saving 9GB of RAM with python’s __slots__ • http://tech.oyster.com/save-ram-with-python-slots/
  72. 72. Getting in-depth with the GIL • Dave Beazley - Guide on how the GIL Operates • http://www.dabeaz.com/python/GIL.pdf • Dave Beazley - New GIL in Python 3.2 • http://www.dabeaz.com/python/NewGIL.pdf • Dave Beazley - Inside Look at Infamous GIL Patch • http://dabeaz.blogspot.com/2011/08/inside-look-at-gil- removal-patch-of.html
  73. 73. Why can’t we use the REPL to follow along at home? • Because It doesn’t behave like a typical python program that’s being executed. • Further reading: http://stackoverflow.com/questions/ 25281892/weird-id-result-on-cpython-intobject PYTHON PRE-LOADS OBJECTS • Many objects are loaded by Python as the interpreter starts. • Called peephole optimization. • Numbers: -5 -> 256 • Single Letter Strings • Common Exceptions • Further reading: http://akaptur.com/blog/2014/08/02/ the-cpython-peephole-optimizer-and-you/
  74. 74. Common Question - Why doesn’t python a python program shrink in memory after garbage collection? • The freed memory is fragmented. • i.e. It’s not freed in one continuous block. • When we say memory is freed during garbage collection, it’s released back to python to use for other objects, and not necessarily to the system. • After garbage collection, the size of the python program likely won’t go down.
  75. 75. PyListObject type list refcount 1 value size 3 capacity 10 nums Value -10 refcount 1 type integer PyObject Value -9 refcount 2 type integer PyObject How does python store container objects?
  76. 76. Credits Big thanks to: • Faris Chebib & The Salt Lake City Python Meetup • The many friends & co-workers who lent me their eyes & ears, particularly Steve Holden Special thanks to all the people who made and released these awesome resources for free: ■ Presentation template by SlidesCarnival ■ Photographs by Unsplash ■ Icons by iconsdb

×