1. State of the Art Thin Provisioning
Stephen Foskett
stephen@fosketts.net
twitter.com/sfoskett
2. Storage Is Supposed To Be Getting Cheaper!
- Disk cost is dropping rapidly; $250 buys: 1994: 2 GB, 1999: 20 GB, 2004: 200 GB, 2009: 2000 GB
- But enterprise storage costs keep rising!
3. Where Is The Cost?
- Hardware and software make up a small percentage of total enterprise storage spending…
- …and hard disk drive capacity makes up a small percentage of that!
- Data center/environmental, administrative personnel, maintenance, and data protection are much bigger
- The biggest opportunity is inefficiency, but this has always been hard to tackle
4. Over-Allocation and Under-Utilization
- Conventional storage provisioning is grossly inefficient
- [Diagram: capacity shrinks at every stage, from Raw Disk Capacity Purchased → Usable Protected Storage Capacity → Allocated to Servers → Requested Capacity → Used by Files → Required Capacity]
5. Thin Provisioning Simplified!
- [Diagram: traditional storage provisioning splits a volume into "Used", "Allocated but unused", and "Free for allocation"; thin storage provisioning allocates only what is "Actually Used"]
6. Thin Provisioning: Potentially Problematic
- Storage is commonly over-allocated to servers
- Some arrays can "thinly" provision just the capacity that actually contains data
  - 500 GB request for a new project, but only 2 GB of initial data is written – the array allocates only 2 GB and expands as data is written
- What's not to love?
  - Oops – we provisioned a petabyte and ran out of storage
  - Chunk sizes and formatting conflicts
  - Can it thin unprovision?
  - Can it replicate to and from thin provisioned volumes?
8. Ever Play the "Telephone" Game?
- Each layer obscures the ones above and below it
- [Diagram: SNIA Shared Storage Model – IV Application; III File/Record Layer (File System, Database); II Block Aggregation (IIc Host, IIb Network, IIa Device); I Storage Devices]
9. File System: It's (Relatively) Easy to Allocate on Write
- As applications write data, storage capacity is allocated
- File system write requests pass through to storage systems, so arrays can wait to allocate until a write actually arrives
10. File System: But What About De-Allocate on Delete?
- When data is deleted, storage capacity should be freed up
- But most file systems don't send a consistent "de-allocate" message to storage, so many thin systems get fatter over time
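The asymmetry in slides 9 and 10 can be sketched in a few lines of Python. This is an illustrative model only (the class and names are invented, not any vendor's implementation): chunks of backing capacity are allocated lazily on first write, but because no unmap ever reaches the "array", deletes never shrink the allocation.

```python
# Minimal sketch of allocate-on-write in a thin volume. CHUNK_SIZE and
# ThinVolume are illustrative assumptions, not a real array's design.

CHUNK_SIZE = 4096  # bytes per allocation chunk

class ThinVolume:
    def __init__(self, virtual_size):
        self.virtual_size = virtual_size   # capacity promised to the host
        self.chunks = {}                   # chunk index -> backing buffer

    def write(self, offset, data):
        """Allocate-on-write: back a chunk only when it first receives data."""
        first = offset // CHUNK_SIZE
        last = (offset + len(data) - 1) // CHUNK_SIZE
        for i in range(first, last + 1):
            self.chunks.setdefault(i, bytearray(CHUNK_SIZE))
        # (payload copy omitted; only allocation accounting is shown)

    def allocated_bytes(self):
        return len(self.chunks) * CHUNK_SIZE

vol = ThinVolume(virtual_size=500 * 2**30)   # a "500 GB" request
vol.write(0, b"x" * 2 * CHUNK_SIZE)          # host writes ~2 chunks of data
print(vol.allocated_bytes())                 # -> 8192: only 2 chunks backed

# A file delete updates file-system metadata only; no "de-allocate"
# message reaches the array, so self.chunks never shrinks over time.
```

The missing half of the protocol is exactly what the rest of the deck is about: zero page reclaim, WRITE_SAME, and TRIM/UNMAP are all attempts to give the delete path a way back down the stack.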
12. Server Smarts: Metadata Monitoring
- File system/VM combos can handle thin provisioning on their own
  - ZFS, Veritas Volume Manager, VMware VMFS
- Arrays can "watch" an operating system allocate and de-allocate storage
  - Perilous! Known file systems and volume formats only!
  - Data Robotics Drobo supports FAT32, NTFS, HFS+ – it watches the file allocation table for deletes
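The metadata-monitoring idea reduces to comparing two views of the same chunks. The sketch below is deliberately simplified (a plain boolean bitmap stands in for a real FAT32/NTFS/HFS+ allocation structure, and the function name is invented): the array backs some chunks, the file system's own metadata says which are still in use, and the difference is reclaimable.

```python
# Illustrative model of Drobo-style metadata monitoring: a plain bitmap
# stands in for the real file allocation table the array would parse.

def reclaimable_chunks(allocation_bitmap, backed_chunks):
    """Return chunks the array backs but the file system marks free."""
    return {c for c in backed_chunks if not allocation_bitmap[c]}

# File system marks chunks 0 and 2 in use; the file in chunk 1 was deleted
bitmap = [True, False, True, False]
backed = {0, 1, 2}                          # chunks the array has allocated
print(reclaimable_chunks(bitmap, backed))   # -> {1}
```

The "perilous" caveat in the slide is visible even here: the scheme only works if the array can correctly parse that bitmap, which is why it is limited to known file systems and volume formats.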
13. Storage Smarts: Zero Page Reclaim
- Storage arrays watch for "pages" containing all zeros and simply don't write them
  - IBM XIV, 3PAR, NetApp (with dedupe), HDS, EMC V-Max
- Some storage vendors rely on utilities to reclaim
  - NetApp SnapDrive for Windows 5.0
  - Compellent Free Space Recovery
  - Veritas Storage Foundation Thin Reclamation
- Can also force it with sdelete
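The reclaim scan itself is conceptually simple, which is why the next slide calls it "straightforward to implement in storage". A minimal sketch (invented names, assuming the freed space has already been zeroed by a utility such as sdelete): walk the backed pages and release any that are all zeros.

```python
# Sketch of zero-page reclaim: drop every backing page that contains
# only zeros. Real arrays do this per internal page, often in batches.

PAGE = 4096

def zero_page_reclaim(pages):
    """pages: dict of page index -> bytes. Remove all-zero pages in place."""
    for idx in [i for i, buf in pages.items() if not any(buf)]:
        del pages[idx]
    return pages

pages = {
    0: b"\x00" * PAGE,                    # freed space the OS zeroed
    1: b"live data".ljust(PAGE, b"\x00"), # real data, padded with zeros
    2: b"\x00" * PAGE,                    # never-touched zeros
}
zero_page_reclaim(pages)
print(sorted(pages))   # -> [1]: only the page with real data stays backed
```

Note that page 1 survives even though it is mostly zero padding: reclaim is all-or-nothing per page, one reason implementations are "page-based" and granularity matters.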
14. Zero Page Reclaim: Pros and Cons
- Pro:
  - Straightforward to implement in storage
  - Some implementations: VMware eagerzeroedthick
- Con:
  - Requires the application/OS/file system to actually have written all zeros – most just ignore unused space rather than zeroing it
  - Most implementations are page-based
  - Drives more I/O
  - VMware thin/thick formats don't work
15. The Lingo: WRITE_SAME
- Facilitates zero page reclaim: "write this block 1,000,000 times"
- Pro:
  - Conserves I/O operations
  - Popular with array vendors
  - Exists and is even implemented (a little)
- Con:
  - Depends on file system layer intelligence
  - Still introduces extra I/O
  - Could be very, very bad in a thin-unaware array
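Why WRITE_SAME can be either a gift or a disaster depends entirely on the receiving array. The sketch below is a toy model (both functions and the dict-as-volume are invented for illustration): a thin-aware array recognizes the all-zero pattern and deallocates the range, while a thin-unaware array obediently materializes a million identical blocks.

```python
# Toy contrast of thin-aware vs thin-unaware handling of a WRITE_SAME
# command ("write this one block `count` times starting at `lba`").

BLOCK = 512

def write_same_thin_aware(volume, lba, block, count):
    """Thin-aware: an all-zero pattern deallocates instead of writing."""
    if not any(block):
        for i in range(lba, lba + count):
            volume.pop(i, None)        # zero pattern == release the range
    else:
        for i in range(lba, lba + count):
            volume[i] = block

def write_same_thin_unaware(volume, lba, block, count):
    """Thin-unaware: dutifully materialize every block -- one command
    can inflate the volume by `count` blocks (the 'very, very bad' case)."""
    for i in range(lba, lba + count):
        volume[i] = block

aware, unaware = {}, {}
zeros = b"\x00" * BLOCK
write_same_thin_aware(aware, 0, zeros, 1_000_000)
write_same_thin_unaware(unaware, 0, zeros, 1_000_000)
print(len(aware), len(unaware))   # -> 0 1000000
```

Either way the single command replaces a million host-side writes, which is the "conserves I/O operations" pro; the con is what the array does with it.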
16. The Bridge: Veritas Thin API
- Thin Reclamation API can communicate de-allocation to arrays by zeroing using WRITE_SAME/UNMAP
  - Introduced in 5.0 (UNIX) and 5.1 (Windows)
  - Supports 3PAR, EMC CLARiiON CX4, HDS USP V/VM, HP XP20k/24k, IBM XIV
  - Will also support Compellent, EMC Symmetrix DMX, Fujitsu Eternus, HP EVA, HDS AMS, IBM DS8k, NetApp
- SmartMove copies only allocated blocks
  - Supports any/all storage systems and works with thin-capable arrays
  - Speeds up migrations in all cases
17. What About TRIM?
- TRIM (ATA) and TRIM/UNMAP/PUNCH (SCSI) can inform storage that a block is no longer needed
- Designed for SSD architecture:
  - Cells grouped into 4 kB pages and 512 kB blocks
  - Only empty pages can be written to
  - Writing to empty pages is quick!
  - Writing to used pages requires a block erase – read-erase-write is slow(er)
- OS support for TRIM: Windows 7 & Server 2008 R2, Linux 2.6.33, OpenSolaris, FreeBSD 9
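The page/block mechanics above can be captured in a tiny simulation. This is an assumed, heavily simplified flash model (the `Ssd` class is invented; real drives remap pages via a flash translation layer rather than erasing in place): rewriting a live page costs a block erase, while TRIM marks the page empty so a later write takes the fast path.

```python
# Simplified model of why TRIM helps an SSD: pages live inside erase
# blocks, and only empty pages can be written without a block erase.

PAGES_PER_BLOCK = 128   # e.g. 4 kB pages grouped into 512 kB blocks

class Ssd:
    def __init__(self):
        self.used = set()      # (block, page) pairs currently holding data
        self.erases = 0        # count of slow read-erase-write cycles

    def write(self, block, page):
        if (block, page) in self.used:
            # Page already holds data: the whole block must be erased
            # before rewriting (the slow read-erase-write path).
            self.erases += 1
            self.used = {p for p in self.used if p[0] != block}
        self.used.add((block, page))

    def trim(self, block, page):
        # OS informs the drive the page's contents are no longer needed.
        self.used.discard((block, page))

ssd = Ssd()
ssd.write(0, 5)      # fast: page was empty
ssd.write(0, 5)      # slow: rewrite forces a block erase
ssd.trim(0, 5)       # file deleted; drive marks the page empty
ssd.write(0, 5)      # fast again: no erase needed
print(ssd.erases)    # -> 1
```

Without the trim() call, the third write would have cost a second erase; that saved erase cycle, not thin provisioning, is what TRIM was designed for, which is the point of the next slide.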
18. TRIM Isn't For Thin
- Not really a thin-provisioning command, but it could play one on TV
- NetApp proposed a hole punching standard to the INCITS T10 committee
- HDS and EMC prefer the UNMAP bit
- A similar NetApp approach uses NFS and a Windows file system redirect
24. Stephen's Dream
- Thin provisioning could be awesome, provided it is integrated at all levels of the stack:
  - Smart applications that don't spew data everywhere
  - Smart file systems and volume managers that communicate what is and isn't used
  - Smart virtualization layers that don't obscure usage
  - Smart storage systems that act on all of this information with granularity and without falling over dead
  - Smart monitoring systems to tie everything together and head off disaster