Why 4K?
     October 2, 2012
      George Wilson
george.wilson@delphix.com
Why 4K? (Not Y4K)
●   This is not the next Millennium bug!




                         Delphix Proprietary and Confidential
Storage History
●   1998 IBM publishes a paper proposing an increase of disk
    sector size from 512B to 4K
●   2000 4K IDEMA (International Disk Drive Equipment and
    Materials Association) committee was formed
●   2005 ZFS released in OpenSolaris with support for block sizes
    ranging from 512B to 128K
●   2005 512B emulation mode proposed, later known as AF
    512e
●   2006 ZFS adds large sector support
●   2009 Advanced Format is approved as naming convention
    for 4K sectors
●   2011 All hard drive manufacturers start to ship AF 512e drives
Advanced Format Drives
●   Two flavors of Advanced Format Drives
     ○ AF 512e - Advanced Format 512B Emulation (today)
     ○ AF 4Kn - Advanced Format 4K Native (future, 2012?)
Advanced Format 512e (AF 512e)
●   Maps eight 512B logical blocks
    into one physical 4K block
●   Provides an emulation layer
    for compatibility


     0   1   2     3     4      5         6         7               512B Logical Blocks

             4K Physical Block #0                                   4K Physical Blocks
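
The mapping in the diagram above can be sketched in a few lines. This is only an illustration of the arithmetic, not drive firmware; the names `LOGICAL`, `PHYSICAL`, and `locate` are made up for this example.

```python
LOGICAL = 512                        # bytes per logical block exposed by an AF 512e drive
PHYSICAL = 4096                      # bytes per physical block on the media
PER_PHYSICAL = PHYSICAL // LOGICAL   # 8 logical blocks share one physical block


def locate(lba):
    """Return (physical block number, byte offset inside it) for a 512B LBA."""
    return lba // PER_PHYSICAL, (lba % PER_PHYSICAL) * LOGICAL


if __name__ == "__main__":
    for lba in (0, 7, 8):
        print(lba, locate(lba))
```

Logical blocks 0 through 7 all land in physical block #0; logical block 8 is the first one in physical block #1.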




Predicting Future Problems

"Access on a 512-byte basis would continue to be
supported, but performance would be inferior to
that in which access is done on a 4096-byte basis,
and might well be inferior to that of previous
drives with 512-byte native block size." -- Large
Block Size by Paul Hodges and David Cheng, 1998
Problems in a 4K World
●   Lies
     ○ AF 512e drives lie about their physical block size
     ○ LUNs from storage vendors lie about their physical block size
●   Misaligned I/O
     ○ Requires proper (4K-aligned) partitioning
     ○ Some AF 512e drives provide an XP jumper (XP partition
        starts on sector 63, not 4K aligned)
●   Read-modify-write
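
Whether a partition start is 4K aligned is simple arithmetic on the starting sector. The sketch below assumes 512B logical sectors and a 4K physical block; `is_4k_aligned` is a hypothetical helper name, not part of any partitioning tool.

```python
def is_4k_aligned(start_sector, logical=512, physical=4096):
    """True if a partition starting at this 512B sector begins on a 4K boundary."""
    return (start_sector * logical) % physical == 0


if __name__ == "__main__":
    for sector in (63, 64, 2048):
        byte_offset = sector * 512
        print(sector, byte_offset, is_4k_aligned(sector))
```

Sector 63 starts at byte 32256, which is 3584 bytes past a 4K boundary, so every 4K write issued inside such a partition straddles two physical blocks. Starting sectors that are multiples of 8 (for example 64 or 2048) are aligned.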



Sub-block Reads

  Read 512B Block

   0    1     2     3     4      5         6         7               512B Logical Blocks

              4K Physical Block #1                                   4K Physical Blocks

                                                    Must Read 4K Block


Sub-block Writes (Read-modify-write)
●   Logical block: write 512B
●   Physical block: read 4K
●   Physical block: write 4K

                 0   1   2     3     4          5          6         7          512B Logical Blocks

                         4K Physical Block #0                                   4K Physical Blocks
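
The read-modify-write sequence above can be modelled with a toy in-memory disk. This is a sketch of the behavior under stated assumptions (512B logical blocks, 4K physical blocks, no caching); the class `Emulated512eDisk` and its counters are invented for illustration and do not describe any real firmware.

```python
LOGICAL = 512
PHYSICAL = 4096


class Emulated512eDisk:
    """Toy model of a 512e drive that counts physical-block I/Os."""

    def __init__(self, physical_blocks):
        self.media = [bytearray(PHYSICAL) for _ in range(physical_blocks)]
        self.physical_reads = 0
        self.physical_writes = 0

    def write_logical(self, lba, data):
        """Write one 512B logical block using read-modify-write."""
        if len(data) != LOGICAL:
            raise ValueError("logical writes must be exactly 512 bytes")
        pbn, offset = divmod(lba * LOGICAL, PHYSICAL)
        sector = bytearray(self.media[pbn])      # read the whole 4K block
        self.physical_reads += 1
        sector[offset:offset + LOGICAL] = data   # modify 512 bytes of it
        self.media[pbn] = sector                 # write the whole 4K block back
        self.physical_writes += 1


if __name__ == "__main__":
    disk = Emulated512eDisk(physical_blocks=2)
    disk.write_logical(3, b"\xab" * LOGICAL)
    print(disk.physical_reads, disk.physical_writes)
```

A single 512B write costs one 4K read plus one 4K write, which is where the performance penalty of sub-block writes comes from.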

Misaligned 4K Writes
●   Logical block: write 4K
●   Physical blocks: read 2 x 4K
●   Physical blocks: write 2 x 4K

                 0   1   2     3     4          5          6         7          8     9

                         4K Physical Block #0                                       4K Physical Block #1
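
How many physical blocks an I/O touches, and whether it forces a read-modify-write, follows directly from its byte offset and length. A minimal sketch, assuming 4K physical blocks; both function names are hypothetical.

```python
def physical_blocks_touched(offset, length, physical=4096):
    """Number of physical blocks covered by an I/O at this byte offset and length."""
    first = offset // physical
    last = (offset + length - 1) // physical
    return last - first + 1


def needs_read_modify_write(offset, length, physical=4096):
    """A write needs RMW if it does not cover whole physical blocks."""
    return offset % physical != 0 or length % physical != 0


if __name__ == "__main__":
    # Aligned 4K write, then a 4K write starting at logical block 2 (byte 1024).
    for offset in (0, 2 * 512):
        print(offset,
              physical_blocks_touched(offset, 4096),
              needs_read_modify_write(offset, 4096))
```

The aligned write touches one physical block and needs no read. The same 4K write shifted by two logical blocks touches two physical blocks and must read and rewrite both.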

Solutions (sort of)
●   Override the lies from the device
     ○ FreeBSD, Illumos, and Linux have all implemented a way
       to override the discovered sector size
     ○ FreeBSD
         ■ use gnop to create a 4K device:
            gnop create -S 4096 <device>
     ○ Illumos
         ■ add an override into sd.conf:
            sd-config-list = "VENDOR        PRODUCT", "physical-block-size:4096";
     ○ Linux
         ■ zpool create -o ashift=12 tank <device>
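
The `ashift=12` value in the Linux command is the base-2 logarithm of the sector size the pool should assume (2^12 = 4096). A small sketch of that relationship; `ashift_for` is a hypothetical helper, not a ZFS API.

```python
def ashift_for(sector_size):
    """Return log2(sector_size); ZFS expresses sector size as this exponent."""
    if sector_size <= 0 or sector_size & (sector_size - 1):
        raise ValueError("sector size must be a positive power of two")
    return sector_size.bit_length() - 1


if __name__ == "__main__":
    for size in (512, 4096):
        print(size, ashift_for(size))
```

A 512B-sector pool corresponds to ashift 9 and a 4K-sector pool to ashift 12.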




Drawbacks of 4K and ZFS
●   Reduced compression ratio
     ○ Blocks smaller than 4K get 0% compression
     ○ An 8K block either compresses to 4K (50%) or not at all
●   Migrating drives from 512B to 4K
●   Inefficient metadata allocation
     ○ Some metadata is allocated in 4K chunks and will no
        longer get compressed
●   Improper accounting of compressed sizes in datasets
●   RAID-Z with 4K sectors is not recommended
●   Configuring root pools to use 4K
     ○ GRUB support?
●   Fewer uberblocks
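
Two of the drawbacks above come down to rounding. The sketch below assumes that allocations are rounded up to a whole number of sectors (2^ashift bytes), and that each vdev label holds a 128 KiB uberblock ring whose slots are at least 1 KiB and otherwise one sector in size. It ignores ZFS's minimum-savings threshold for keeping a compressed block, and all function names are made up for illustration.

```python
def allocated_size(compressed_bytes, ashift):
    """Round a compressed size up to a whole number of 2**ashift-byte sectors."""
    sector = 1 << ashift
    return -(-compressed_bytes // sector) * sector


def space_saved(logical_bytes, compressed_bytes, ashift):
    """Fraction of the logical block size saved on disk after rounding."""
    allocated = min(allocated_size(compressed_bytes, ashift), logical_bytes)
    return 1 - allocated / logical_bytes


def uberblock_slots(ashift, ring_bytes=128 * 1024, min_slot_shift=10):
    """Slots in the uberblock ring, assuming a 128 KiB ring and 1 KiB minimum slot."""
    return ring_bytes >> max(ashift, min_slot_shift)


if __name__ == "__main__":
    # An 8K block that compresses to 5000 bytes, on 512B and 4K sectors.
    print(space_saved(8192, 5000, ashift=9), space_saved(8192, 5000, ashift=12))
    print(uberblock_slots(9), uberblock_slots(12))
```

With 512B sectors the 5000-byte result occupies 5120 bytes and saves 37.5%; with 4K sectors it rounds back up to 8192 bytes and saves nothing. Under the stated assumptions the ring holds 128 uberblocks at ashift 9 and 32 at ashift 12.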
Q&A / Beer?




ZFS Day
     October 2, 2012
      George Wilson
george.wilson@delphix.com
