Hybrid Storage Pools (Now with the benefit of hindsight!)
Hybrid Storage Pools Using Disk and Flash with ZFS (Now with the benefit of hindsight!) Adam Leventhal @ahl
Flash Emerges • Storage medium invented in 1980 – Very fast reads (~50 µs) – Fast writes (~300 µs) – High IOPS / low latency – Limited number of write cycles • 2004: flash cost as much as DRAM • 2007: flash cost was right between DRAM and disk
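To put the latency figures on this slide in terms of IOPS, here is a rough back-of-the-envelope sketch. It assumes a single outstanding request at a time (queue depth 1), which the slide does not state; real devices serve many requests in parallel and will report higher numbers. The function name is made up for illustration.

```python
def iops_at_queue_depth_one(latency_us: float) -> float:
    """Operations per second when each request waits for the previous one.

    Assumption: one request in flight at a time, so throughput is simply
    the inverse of the per-request latency.
    """
    return 1_000_000 / latency_us


# Latency figures taken from the slide.
read_iops = iops_at_queue_depth_one(50)    # ~50 us reads
write_iops = iops_at_queue_depth_one(300)  # ~300 us writes

print(f"reads:  {read_iops:,.0f} IOPS")
print(f"writes: {write_iops:,.0f} IOPS")
```

Under that assumption, reads come out around six times faster than writes, which matches the asymmetry on the slide.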
Disk is dead… just like tape • Many predicted the death of disk or relegation of disk to backup • Didn’t happen • All-flash solutions still trying to gain mass adoption
ZFS circa 2007 • Sun was developing a ZFS-based storage appliance (Fishworks) • ZFS: enterprise class storage on commodity hardware • Problem: enterprise storage was a lot faster • Looked at traditional solutions – NV-DRAM to accelerate writes – Massive DRAM to cache reads • But it was just the right time for flash…
Hybrid Storage Pool (HSP) • Use flash as a storage tier • Between DRAM and disk in cost, capacity, latency, throughput • Use commodity disks – 7200 RPM – Good throughput – Great $/GB and watts/GB • Combine disk, flash, DRAM into a hybrid pool • In ZFS: – ZFS intent log (ZIL) for write acceleration – L2ARC to extend the reach of the ZFS cache
ZFS Caching • Adaptive Replacement Cache (ARC) as the primary DRAM cache • L2ARC developed by Brendan Gregg to use external (flash) devices • Takes into account optimal IO patterns for flash – Random, small writes = hastened failure – Sequential, large writes = happy SSDs – Throttles writes to preserve longevity • Uses predictive eviction to identify blocks to cache
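The flash-friendly write pattern described on this slide (large sequential writes, throttled to preserve longevity) can be illustrated with a toy model. This is a sketch under stated assumptions, not the actual L2ARC code: the class name, the `write_budget` knob, and the 8 MiB value are all made up for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ThrottledFeeder:
    """Hypothetical model of a throttled, batching cache feed.

    Not the real L2ARC implementation; it only illustrates the idea of
    packing small blocks into one large write per feed interval.
    """
    write_budget: int                              # max bytes per interval (assumed knob)
    log: list[int] = field(default_factory=list)   # size of each device write issued

    def feed(self, candidates: list[int]) -> list[int]:
        """Consume block sizes eligible for caching; return those left over.

        Blocks are packed in order into a single large write, stopping
        as soon as the next block would exceed the per-interval budget.
        A block larger than the budget would never be written in this
        toy model.
        """
        batch = 0
        taken = 0
        for size in candidates:
            if batch + size > self.write_budget:
                break
            batch += size
            taken += 1
        if batch:
            self.log.append(batch)  # one large write instead of many small ones
        return candidates[taken:]


# Usage: 100 blocks of 128 KiB against an illustrative 8 MiB budget.
feeder = ThrottledFeeder(write_budget=8 * 1024 * 1024)
pending = [128 * 1024] * 100
pending = feeder.feed(pending)  # first interval fills the budget
pending = feeder.feed(pending)  # second interval drains the rest
print(feeder.log, len(pending))
```

The device sees two large writes rather than 100 small ones, and no interval exceeds the budget. The trade-off is that a tight budget slows down how quickly the cache fills.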
L2ARC Problems • Non-persistent – After a reboot or fatal system failure, the cache is empty • Slow to warm up – Will only write to one device at a time -> best case 1TB / hour – Real world example 2TB in 24 hours • Conceptually most of the way there • No real way to tune it to a workload • Not much real-world testing and tuning done
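The two warm-up figures on this slide are easier to compare as sustained fill rates. The following sketch does the conversion; it assumes decimal terabytes (10^12 bytes), which the slide does not specify, and the helper name is made up for illustration.

```python
TB = 10**12  # decimal terabyte; an assumption, the slide does not say TB vs TiB
MB = 10**6


def fill_rate_mb_per_s(bytes_written: int, hours: float) -> float:
    """Average fill rate in MB/s for a given amount written over a period."""
    return bytes_written / (hours * 3600) / MB


best_case = fill_rate_mb_per_s(1 * TB, 1)     # "best case 1TB / hour"
real_world = fill_rate_mb_per_s(2 * TB, 24)   # "2TB in 24 hours"

print(f"best case:  {best_case:.0f} MB/s")
print(f"real world: {real_world:.0f} MB/s")
print(f"ratio:      {best_case / real_world:.0f}x")
```

With these assumptions the real-world example fills the cache roughly twelve times more slowly than the quoted best case.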
Changing Landscape • DRAM prices have dropped dramatically • Large memory systems available (3TB+) • NAND flash is getting trickier to build around • Endurance and performance decrease as lithography and price decrease – MLC and “TLC” (volume flash) have particularly short lives • Running into size limitations – 32nm in 2008 – 19nm today – Supposed floor around 11nm • SSDs are becoming increasingly complex
What to do today? • The L2ARC can help – The SSD space is large and highly varied – Generally cheap, laptop SSDs suffice for the L2ARC – Give it enough time to warm up (hours or days) – Measure the impact on your actual workload • The ARC is great and relatively simple – Load up on DRAM
Next for ZFS • For the L2ARC to be viable, it needs to be persistent • Lots of performance work needed – Run it through a bunch of real-world use cases – Make it easy to collect coherent, relevant data – Create the right knobs for users to turn • There are a few companies using the L2ARC • Hopefully they will take up the mantle