Efficient Immutable Data Structures (Okasaki for Dummies)

A talk I gave at Intertrust on September 18, 2015.

I present some core concepts from functional programming and show how the work done by Chris Okasaki and others on efficient immutable data structures has made it practical to use functional techniques in production programs.

  1. How Efficient Immutable Data Enables Functional Programming
  2. How Efficient Immutable Data Enables Functional Programming, or Okasaki For Dummies
  3. Who Am I? (September 2015)
  4. Tom Faulhaber ➡ Planet OS CTO ➡ Background in networking, Unix OS, visualization, video ➡ Currently working mostly in “Big Data” ➡ Contributor to the Clojure programming language
  5. Who Are YOU?
  6. What is functional programming?
  7.
  8. Pure Functions: y = f(x)
  9. Pure Functions: y = f(x)
  10. Pure Functions: y = f(x) (diagram labels: Not modified; Not shared)
  11. Higher-order Functions: map(f, [x1, x2, ..., xn]) → [f(x1), f(x2), ..., f(xn)]
  12. Higher-order Functions: g = map(f). Result is a new function.
  13. Higher-order Functions: g = map f
  14. Other Aspects: ➡ Type inference ➡ Laziness
  15. Functional is the opposite of Object-oriented
  16. Object-oriented: State is managed through encapsulation. Functional: State is avoided altogether.
  17. Why functional?
  18. Why functional? ➡ No shared state makes it easier to reason about programs ➡ Concurrency problems simply go away (almost!) ➡ Undo and backtracking are trivial ➡ Algorithms are often more elegant. “It is better to have 100 functions operate on one data structure than 10 functions on 10 data structures.” - Alan Perlis
  19. Why functional? A host of new languages support the functional model: - ML, Haskell, Clojure, Scala, Idris - All with different degrees of purity
  20. There’s a catch!
  21. There’s a catch! This is cheap: f(5)
  22. There’s a catch! But this is expensive: f({"type": "object", "properties": { "mesos": { "description": "Mesos specific configuration properties", "type": "object", "properties": { "master": { … } … } … } … } … })
  23. There’s a catch! And this is crazy: f(<my whole database>)
  24. Persistent Data Structures to the Rescue
  25. Persistent Data Structures. The goal: approximate the performance of mutable data structures in both CPU and memory. The big secret: use structural sharing! There are lots of little secrets, too. We won’t cover them today.
  26. Persistent Data Structures - History. Timeline (axis marked 1990, 2000, 2010) with these entries: Persistent Arrays (Dietz); ML Language (1973); Catenable Queues (Buchsbaum/Tarjan); Okasaki; Haskell Language; Clojure Collections; Finger Trees (1977); Zipper (Huet); Data.Map in Haskell; Priority Search Queues (Hinze); Fast And Space Efficient Trie Searches (Bagwell); Ideal Hash Trees (Bagwell); RRB Trees (Bagwell/Rompf)
  27. Example: Vector ➡ In Java/C#, ArrayList; in C++, std::vector. ➡ A list with constant-time access and update and amortized constant-time append. Diagram: [The, quick, brown, fox, jumps, over] (size 6); a[3] = “dog” overwrites the slot in place, giving [The, quick, brown, dog, jumps, over] (size 6).
  28. Example: Vector ➡ In Java/C#, ArrayList; in C++, std::vector. ➡ A list with constant-time access and update and amortized constant-time append. Diagram: [The, quick, brown, dog, jumps, over] (size 6); a.push_back(“the”) gives [The, quick, brown, dog, jumps, over, the] (size 7).
  29. Example: Persistent Vector ➡ To build a persistent vector, we start with a tree: depth = ⌈log n⌉. Data is in the leaves. Diagram: a tree of size 6 with leaves The, quick, brown, fox, jumps, over.
  30. Diagram (animation): the size-6 tree with leaves The, quick, brown, fox, jumps, over at indices 0 to 5; indices in binary 000, 001, 010, 011, 100, 101; root-to-leaf paths LLL, LLR, LRL, LRR, RLL, RLR. Lookup x = a[3]: index 3 is 011, which is path LRR and leads to “fox”.
  31. Diagram (animation): b = a.add(“the”). A new root of size 7 appears next to the original root of size 6, and the new tree contains “the”.
  32. Diagram: b (size 7): The, quick, brown, fox, jumps, over, the.
  33. Diagram: a (size 6): The, quick, brown, fox, jumps, over.
  34. Diagram: both roots (sizes 7 and 6) drawn over one set of leaves: The, quick, brown, fox, jumps, over, the.
  35. But, wait…
  36. But, wait… O(1) ≠ O(log n). This isn’t what you promised!
  37. Plot (animation): tree depth (y-axis, 2 to 10) against number of elements (x-axis, 0 to 1000), with curves d = 1 and d = ⌈log₂ n⌉.
  38. The answer: Use 32-way trees
  39. x = a[7022896]. The index in binary, split into 5-bit groups: 00110 10110 01010 01001 10000, which are child positions 6, 22, 10, 9, 16.
  40. Diagram: the path 6, 22, 10, 9, 16 through the 32-way tree, ending at the element “apple”.
  41. O(1) ≈ O(log₃₂ n)
  42. Plot (animation): tree depth (y-axis, 2 to 10) against number of elements (x-axis, 0 to 1000), with curves d = 1, d = ⌈log₂ n⌉, and d = ⌈log₃₂ n⌉.
  43. Example: Tree Walking ➡ The functional equivalent of the visitor pattern
  44. Example: Tree Walking. Clojure code to implement the walker (postwalk comes from the clojure.walk namespace): (postwalk (fn [node] (if (= :blue (:color node)) (assoc node :color :green) node)) tree)
  45. Example: Zippers ➡ Allow you to navigate and update a tree across many operations by “unzipping” it.
  46. Takeaways ➡ Functional data structures can approximate the performance of mutable data structures, but usually won’t be quite as fast. ➡ … but not having to do state management often wins back the difference. ➡ We need to choose data structures carefully depending on how they’re going to be used. ➡ This doesn’t solve shared state, just reduces it (but see message passing, software transactional memory, etc.).
  47. References: Chris Okasaki, Purely Functional Data Structures, doctoral dissertation, Carnegie Mellon University, 1996. Rich Hickey, “Are We There Yet?”, presentation at the JVM Languages Summit, 2009. http://www.infoq.com/presentations/Are-We-There-Yet-Rich-Hickey Gerard Huet, “Functional Pearl: The Zipper”, Journal of Functional Programming 7 (5): 549–554. doi:10.1017/s0956796897002864 Jean Niklas L’orange, “Understanding Clojure's Persistent Vectors”, blog post at http://hypirion.com/musings/understanding-persistent-vector-pt-1.
  48. Discussion
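
The persistent-vector slides (structural sharing, 32-way trees, and the 5-bit split of index 7022896) can be sketched roughly as below. This is only an illustration in Python, not Clojure's actual implementation: the names `PVec`, `digits`, `build`, `get`, and `assoc`, the fixed-depth layout, and the `None` padding are all assumptions made for the example, and append is left out.

```python
from dataclasses import dataclass

BITS = 5                 # 5 bits per level gives 32-way branching
WIDTH = 1 << BITS        # 32 children per node
MASK = WIDTH - 1


def digits(index, depth):
    """Split index into `depth` child positions, most significant first."""
    return [(index >> (BITS * level)) & MASK for level in reversed(range(depth))]


@dataclass(frozen=True)
class PVec:
    """Hypothetical fixed-depth persistent vector; every node is a tuple."""
    root: tuple
    depth: int

    def get(self, index):
        node = self.root
        for d in digits(index, self.depth):
            node = node[d]
        return node

    def assoc(self, index, value):
        """Return a new vector with one slot changed, copying a single path."""
        def copy_path(node, path):
            if not path:
                return value
            d = path[0]
            # Rebuild only this node; all sibling subtrees are reused as-is.
            return node[:d] + (copy_path(node[d], path[1:]),) + node[d + 1:]
        return PVec(copy_path(self.root, digits(index, self.depth)), self.depth)


def build(items, depth):
    """Build a full tree of the given depth, padding unused slots with None."""
    level = list(items)
    if len(level) > WIDTH ** depth:
        raise ValueError("too many items for this depth")
    level += [None] * (WIDTH ** depth - len(level))
    for _ in range(depth):
        level = [tuple(level[i:i + WIDTH]) for i in range(0, len(level), WIDTH)]
    return PVec(level[0], depth)


a = build(["The", "quick", "brown", "fox", "jumps", "over"], depth=2)
b = a.assoc(3, "dog")
print(a.get(3), b.get(3))          # the old version is still intact
print(b.root[1] is a.root[1])      # untouched subtrees are shared, not copied
print(digits(7022896, 5))          # child positions for the index on the slide
```

An update copies one node per level, so with 32-way branching the cost grows with the depth of the tree rather than with the number of elements, and every earlier version stays readable.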
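
The depth curves d = ⌈log₂ n⌉ and d = ⌈log₃₂ n⌉ from the plots can be checked with a few lines of Python. The helper name `tree_depth` and the sample sizes are assumptions chosen for this sketch; integer arithmetic is used so that no floating-point rounding is involved.

```python
def tree_depth(n, branching):
    """Smallest d with branching**d >= n, i.e. ceil(log_branching(n)) for n >= 1."""
    depth, capacity = 0, 1
    while capacity < n:
        capacity *= branching
        depth += 1
    return depth


# Columns: number of elements, binary-tree depth, 32-way-tree depth.
for n in (1_000, 1_000_000, 1_000_000_000):
    print(n, tree_depth(n, 2), tree_depth(n, 32))
```

Even at a billion elements the 32-way tree is only 6 levels deep, which is the sense in which the deck treats O(log₃₂ n) as approximately O(1).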
