Learn about Hitchhiker Trees, a new functional, immutable, persistent variation of a fractal tree, from David Greenberg. In these slides, we'll learn how to understand immutable data structures and a variety of trees, introducing new concepts as we build up to the hitchhiker tree.
16. Philosophy of Identity
Q: When isn’t an apple an apple?
A: When an apple points to an orange points to a banana
isn’t an apple points to an orange points to a mango.
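The point of the riddle can be sketched with immutable linked lists: two lists that start with the same element are still different values if anything further down the chain differs. The `Cons` type and variable names below are illustrative assumptions, not taken from the talk.

```python
from typing import NamedTuple, Optional


class Cons(NamedTuple):
    """An immutable linked-list cell: a value plus a pointer to the rest."""
    head: str
    tail: Optional["Cons"]


# apple -> orange -> banana
list_a = Cons("apple", Cons("orange", Cons("banana", None)))
# apple -> orange -> mango
list_b = Cons("apple", Cons("orange", Cons("mango", None)))

# The first cells hold the same fruit...
same_head = list_a.head == list_b.head
# ...but the lists as a whole are different values, because equality
# follows the pointers all the way down to the last cell.
same_list = list_a == list_b

print(same_head, same_list)
```

Because the cells never change, the identity of a list is fully determined by its head and everything it points to.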
28. B Trees are Optimal for
Reads
Lower Bound of logB(n) for sorted lookups
Controlling the base of the logarithm is awesome
log2(1000) = 9.97
log5(1000) = 4.29
log100(1000) = 1.5
Going wide gives big constant speedups for free
Under our I/O cost model
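The effect of the branching factor on lookup depth can be checked with a few lines of arithmetic. This is a minimal sketch; the function name `lookup_depth` is an assumption made for illustration.

```python
import math


def lookup_depth(n: int, branching: int) -> float:
    """Approximate number of levels visited to find one of n sorted keys
    when every node has `branching` children: log base `branching` of n."""
    return math.log(n) / math.log(branching)


# Same 1000 keys, three different branching factors.
depths = {b: round(lookup_depth(1000, b), 2) for b in (2, 5, 100)}

for b, d in depths.items():
    print(f"log{b}(1000) = {d}")
```

Under a cost model where each node visited is one I/O, fewer levels means fewer I/Os per lookup.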
53. Flush Control
Tree              Total I/O   I/O per Flush   Avg I/O per Insert
B+ Tree           21          3               3
Fractal Tree      12          1 to 4          1.7
Hitchhiker Tree   5           5               0.7
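The average column appears to be the total I/O divided by the number of inserts. The slide does not state the insert count; the figures are consistent with seven inserts, which is an inference from the numbers and is marked as an assumption in the sketch below.

```python
# Total I/O per tree, as listed in the table above.
total_io = {"B+ Tree": 21, "Fractal Tree": 12, "Hitchhiker Tree": 5}

# ASSUMPTION: seven inserts. This value is inferred from the ratios in the
# table (21 / 3 = 7); it is not stated on the slide.
ASSUMED_INSERTS = 7

avg_io_per_insert = {
    name: round(io / ASSUMED_INSERTS, 1) for name, io in total_io.items()
}

for name, avg in avg_io_per_insert.items():
    print(f"{name}: {avg} I/O per insert")
```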
54. Real Branching Factors
B+ Trees have fan out of 1000-2000
Hitchhiker Trees have fan out of 100-200
But Hitchhiker Tree buffers hold 900-1000 elements!
60. Outboard
Looks like a hash map
Data stored off-heap in Redis
Functional data structures mean free snapshots
After a VM restart, just reconnect to Redis
Lifetime of in-memory data doesn’t need to be tied to lifetime of runtime memory
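Why functional data structures give snapshots for free can be sketched with a small path-copying binary search tree. This is not Outboard's API or its storage format; `Node`, `insert`, and `lookup` are hypothetical names used only to illustrate the idea that keeping an old root is the snapshot.

```python
from typing import NamedTuple, Optional


class Node(NamedTuple):
    """An immutable binary search tree node."""
    key: str
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def insert(node: Optional[Node], key: str, value: int) -> Node:
    """Return a new tree containing key; only the path to key is copied."""
    if node is None:
        return Node(key, value)
    if key < node.key:
        return node._replace(left=insert(node.left, key, value))
    if key > node.key:
        return node._replace(right=insert(node.right, key, value))
    return node._replace(value=value)


def lookup(node: Optional[Node], key: str) -> Optional[int]:
    """Walk down from the root; return None if the key is absent."""
    while node is not None:
        if key == node.key:
            return node.value
        node = node.left if key < node.key else node.right
    return None


v1 = insert(insert(None, "m", 1), "c", 2)
snapshot = v1                # taking a snapshot is just keeping the old root
v2 = insert(v1, "x", 3)      # the new version shares untouched subtrees

print(lookup(snapshot, "x"), lookup(v2, "x"), v2.left is v1.left)
```

The old root stays valid after the insert, so no copying or locking is needed to read a past version.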
61. What’ll we build next?
Q&A
Thanks to:
Andy Chambers for JDBC Backend & GC Improvements
Casey Marshall for S3 Backend
63. (Hash) Array Mapped Tries
We add the fat node trick from B trees
We hash keys first for even distribution
No need to store full hash: prefix is enough
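The three bullets above can be sketched as a simplified hash trie. This is not a full HAMT: it uses fixed 16-slot nodes rather than bitmap-compressed ones, and it does not handle two distinct keys with an identical 32-bit hash. The choice of `zlib.crc32` as the hash and the names `assoc`, `get`, `Leaf`, and `EMPTY` are assumptions for illustration.

```python
import zlib
from typing import NamedTuple

BITS = 4                      # hash bits consumed per level
WIDTH = 1 << BITS             # 16 slots per "fat" node
EMPTY = (None,) * WIDTH       # an empty branch node


class Leaf(NamedTuple):
    key: str
    value: object


def key_hash(key: str) -> int:
    """Hash the key first so entries spread evenly (32-bit unsigned)."""
    return zlib.crc32(key.encode("utf-8"))


def chunk(h: int, depth: int) -> int:
    """The depth-th 4-bit chunk of the hash, taken from the high bits."""
    return (h >> (32 - BITS * (depth + 1))) & (WIDTH - 1)


def assoc(node: tuple, key: str, value: object, depth: int = 0) -> tuple:
    """Return a new trie with key set; only the touched path is copied."""
    h = key_hash(key)
    i = chunk(h, depth)
    slot = node[i]
    if slot is None or (isinstance(slot, Leaf) and slot.key == key):
        new = Leaf(key, value)
    elif isinstance(slot, Leaf):
        old_h = key_hash(slot.key)
        if old_h == h:
            raise NotImplementedError("full hash collisions are out of scope")
        # Two keys share this prefix: push the old leaf one level down,
        # then insert the new key into that child.
        j = chunk(old_h, depth + 1)
        child = EMPTY[:j] + (slot,) + EMPTY[j + 1:]
        new = assoc(child, key, value, depth + 1)
    else:
        new = assoc(slot, key, value, depth + 1)
    return node[:i] + (new,) + node[i + 1:]


def get(node: tuple, key: str, default: object = None) -> object:
    """Follow hash chunks down until a leaf or an empty slot is reached."""
    h, depth = key_hash(key), 0
    while True:
        slot = node[chunk(h, depth)]
        if slot is None:
            return default
        if isinstance(slot, Leaf):
            return slot.value if slot.key == key else default
        node, depth = slot, depth + 1


m1 = assoc(EMPTY, "apple", 1)
m2 = assoc(m1, "orange", 2)
m3 = assoc(m2, "apple", 10)
print(get(m2, "apple"), get(m3, "apple"), get(m1, "orange"))
```

Leaves do not store their hash: a leaf's position in the trie already encodes the prefix that distinguishes it, and the sketch recomputes the rest from the key only when a leaf has to be pushed down a level.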
Editor's notes
Author, engineer, now consultant working on Mesos & dist sys
Book signing at lunch today!
Unfortunately, we’ll be a sad panda
By copying the list, we get to be a happy panda
explain the color scheme
Introduce concept
Remember this example? Let’s improve it
something about segmented lists & their tradeoffs
We’re going to talk about Binary Search Trees, B trees, B+ trees, Fractal trees, and Hitchhiker trees
Note: sorted
Binary = 2 children per node
CLRS book for algorithm examples
Tries are out of scope for this talk, but they’re how Scala, Clojure, and Elixir implement maps
^^^Cool hashing tricks, if we have time at the end
B stands for “branching factor”
Even a fractal tree needs a functional data structure for hypothetical projection
If we scan, we get out of order & duplicated values