Reliability Modeling and Analysis of
 Energy-Efficient Storage Systems

                  Shu Yin

            Advisor: Dr. Xiao Qin
     Committee Members: Dr. Sanjeev Baskiyar
                          Dr. Alvin Lim
       University Reader: Dr. Shiwen Mao
Presentation Outline
• Motivation
• MINT Model

• MREED Model

• Models Validation

• Reliability Improvement

• Conclusion and Future Work




                    2
Motivation



Data Intensive Applications
• Stream Multimedia
• Bioinformatics
• 3D Graphics
• Weather Forecast
                     3
Data Intensive Computing Application




           Cluster System

                 4
Problem: Energy Dissipation




EPA Report to Congress on Server and Data Center Energy Efficiency, 2007



                                     5
Problem: Energy Dissipation (cont.)
Using 2010 Historical Trends Scenario
• Data Centers consume 110 Billion kWh per Year;
• Assume Average Commercial End User Is Charged 9.46¢ per kWh;
• Disk System Can Account for 27% of the Computing Energy Cost of Data Centers.

Pie chart: Disk System 27%, Other 73%

Disk System May Have An Electrical Cost of 2.8 Billion Dollars!
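The 2.8 billion dollar figure follows from multiplying the three numbers quoted on this slide. A minimal sketch of that arithmetic is given below; the function and constant names are illustrative and do not come from the presentation.

```python
# Figures quoted on the slide (2010 historical-trends scenario).
DATA_CENTER_ENERGY_KWH = 110e9   # 110 billion kWh per year
PRICE_PER_KWH_USD = 0.0946       # 9.46 cents per kWh
DISK_SHARE = 0.27                # disk system share of the computing energy cost


def disk_energy_cost_usd(total_kwh: float, price_usd: float, disk_share: float) -> float:
    """Annual electricity cost attributable to the disk system, in dollars."""
    return total_kwh * price_usd * disk_share


cost = disk_energy_cost_usd(DATA_CENTER_ENERGY_KWH, PRICE_PER_KWH_USD, DISK_SHARE)
print(f"Disk system cost: ${cost / 1e9:.2f} billion per year")
```

The product is roughly 2.81 billion dollars per year, which the slide rounds to 2.8 billion.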

                     6
Existing Energy Conservation Techniques

Software-Directed Power Management
Dynamic Power Management
Redundancy Technique
Multi-speed Setting


How Reliable Are They?


                   7
Conflict Between Energy Efficiency and Reliability




               Energy
             Efficiency

                               Reliability




          Example: Disk Spin Up and Down




                          8
Presentation Outline
•   Motivation
•   MINT Model
• MREED Model
• Models Validation

• Reliability Improvement

• Conclusion and Future Work




                    9
MINT
(MATHEMATICAL RELIABILITY MODELS FOR ENERGY-EFFICIENT PARALLEL DISK SYSTEMS)




                             Energy Conservation
                                 Techniques




                         Single Disk Reliability Model




                        System-Level Reliability Model




                                     10
MINT                         (Single Disk)



                                 Disk Age        Temperature




Frequency     Utilization




            Single Disk Reliability Model



                        Reliability of Single
                                Disk




                             11
MINT                     (Single Disk)




R=α*BaseValue[1]*TemperatureFactor+β*FrequencyAdder[2]




                                           α and β are two coefficients to R

                                        Assumption: α = β = 1 in our research
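The single-disk formula can be evaluated directly once its three inputs are known. The following is a minimal sketch, assuming α = β = 1 as stated above. The function name and the numeric inputs are hypothetical placeholders, not values taken from the Google report [1] or the IDEMA standard [2].

```python
def mint_single_disk_r(base_value: float,
                       temperature_factor: float,
                       frequency_adder: float,
                       alpha: float = 1.0,
                       beta: float = 1.0) -> float:
    """R = alpha * BaseValue * TemperatureFactor + beta * FrequencyAdder."""
    return alpha * base_value * temperature_factor + beta * frequency_adder


# Hypothetical inputs for illustration only:
#   base_value          failure rate implied by disk age and utilization
#   temperature_factor  multiplier for the operating temperature
#   frequency_adder     extra failure rate from spin up/down transitions
r = mint_single_disk_r(base_value=0.05, temperature_factor=1.2, frequency_adder=0.01)
print(f"R = {r:.4f}")
```

Keeping α and β as parameters makes it easy to explore weightings other than the α = β = 1 assumption used in the presentation.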




   [1] E. Pinheiro, W.-D. Weber, and L.A. Barroso. Failure trends in a large disk drive population. Proc.
   USENIX Conf. File and Storage Tech., February 2007.
   [2] IDEMA Standards. Specification of hard disk drive reliability.




                                                    12
MINT            (Single Disk)



R=α*BaseValue*TemperatureFactor+β*FrequencyAdder




Utilization Impact on AFR    Temperature Impact on      Transition Frequency Impact on
                               Temperature Factor               Frequency Adder




                                       13
MINT                      (Single Disk)




R=α*BaseValue*TemperatureFactor+β*FrequencyAdder




Curves shown (top to bottom):
• Frequency=350/Month, T=40°C
• Frequency=250/Month, T=40°C
• Frequency=350/Month, T=35°C
• Frequency=250/Month, T=35°C
• Base Value from Google Report[1]

                                        Single Disk Reliability

[1] E. Pinheiro, W.-D. Weber, and L.A. Barroso. Failure trends in a large disk drive population. Proc.
USENIX Conf. File and Storage Tech., February 2007.




                                                    14
MINT                     (Energy Conservation Techniques- PDC)




                                    Popular Data Concentration (PDC)[3]                  - cold data
                                             System Structure
                                                                                         - hot data




[3] E. Pinheiro and R. Bianchini. Energy conservation techniques for disk array-based servers. Int’l Conf.
on Supercomputing, pages 68–78, June 2004.




                                                   15
MINT                 (Energy Conservation Techniques- PDC)




Access Rate<MIN(Access Rate)       Access Rate>MAX(Access Rate)




                                                                                                 Access Rate<MIN(Access Rate)
                      More Popular Disk                                               Less Popular Disk
                                                                  Access Rate>MAX(Access Rate)




                                                                                        - cold data
                                                                                        - hot data




                                                          16
MINT   (Energy Conservation Techniques- PDC)




       Popular Data Concentration (PDC)[3]        - cold data
                System Structure
                                                  - hot data
       (Optimal Result for Certain Time Phases)
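The data layout that PDC aims for within one time phase can be sketched as follows: files are ranked by access rate and packed onto disks in that order, so the first disk holds the hottest data and the last disk the coldest. This is a simplified illustration, assuming equal-sized files and a fixed number of files per disk; the function name and the sample access rates are hypothetical.

```python
from typing import Dict, List


def pdc_placement(access_rates: Dict[str, float],
                  files_per_disk: int,
                  num_disks: int) -> List[List[str]]:
    """Return one file list per disk, most popular disk first."""
    if len(access_rates) > files_per_disk * num_disks:
        raise ValueError("not enough disk slots for all files")
    # Rank files from hottest to coldest by access rate.
    ranked = sorted(access_rates, key=access_rates.get, reverse=True)
    # Fill disk 0 with the hottest files, then disk 1, and so on.
    return [ranked[i * files_per_disk:(i + 1) * files_per_disk]
            for i in range(num_disks)]


# Hypothetical access counts per month.
rates = {"a": 5, "b": 900, "c": 40, "d": 300}
layout = pdc_placement(rates, files_per_disk=2, num_disks=2)
print(layout)
```

With this layout the less popular disks see few requests and can stay in a low-power state longer, while the most popular disk absorbs most of the load.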




                       17
MINT                  (Energy Conservation Techniques- MAID)




                               Massive Array of Idle Disks (MAID)[4]          - cold data
                                       System Structure
                                                                              - hot data




[4] Dennis Colarelli and Dirk Grunwald. Massive arrays of idle disks for storage archives.
Supercomputing ’02: Proceedings of the 2002 ACM/IEEE conference on Supercomputing, pages 1–11,
Los Alamitos, CA, USA, 2002. IEEE Computer Society Press.




                                               18
MINT                  (Energy Conservation Techniques- MAID)




                        Cache Disk                             Data Disk


                                                                       Access Rate>MAX(Access Rate)




                               Massive Array of Idle Disks (MAID)[4]               - cold data
                                       System Structure
                                                                                   - hot data




[4] Dennis Colarelli and Dirk Grunwald. Massive arrays of idle disks for storage archives.
Supercomputing ’02: Proceedings of the 2002 ACM/IEEE conference on Supercomputing, pages 1–11,
Los Alamitos, CA, USA, 2002. IEEE Computer Society Press.




                                               19
MINT                         (System-Level)


            Access
                                Disk Age          Temperature
            Pattern




                Energy Conservation
                    Techniques



Frequency        Utilization         Frequency        Utilization




            Single Disk Reliability Model


      Reliability of                          Reliability of
        Disk 1                                  Disk n



      System-Level Reliability Model




                          Reliability of A
                       Parallel Disk System




                                20
Preliminary Results                               (experimental setting)




Energy-efficiency Scheme    Number of Disks                    File Access Rate (No. per month)    File Size (KB)
PDC                         20 data (20 in total)              0~10^6                              300
MAID-1                      15 data + 5 cache (20 in total)    0~10^6                              300
MAID-2                      20 data + 5 cache (25 in total)    0~10^6                              300



Read-only Disks



                                     21
Preliminary Result
Comparison Between PDC and MAID




             AFR Comparison of PDC and MAID
          Access Rate (×10^4) Impacts on AFR (T=35°C)




                             22
Preliminary Result
Comparison Between PDC and MAID




     AFR Comparison of PDC and MAID
  Access Rate (×10^4) Impacts on AFR (T=35°C)
          - PDC        - MAID



                                              23
MAID under High Access Rate



               MAID-1




               MAID-2




           AFR Comparison of PDC and MAID
        Access Rate (×10^4) Impacts on AFR (T=35°C)




                           24
MAID under High Access Rate


   Figure panels: MAID-1 and MAID-2

   AFR Comparison of PDC and MAID
Access Rate (×10^4) Impacts on AFR (T=35°C)




                                            25
MINT      (conclusion)




Mathematical Model for Disk Systems
MINT Study on PDC and MAID
But ... What about RAID?
• Data Striping Mechanism
• Energy Consumption Issues
• Reliability Issues
• Complexity




                  26
Presentation Outline
• Motivation
• MINT Model


•   MREED Model
• Models Validation
• Reliability Improvement

• Conclusion and Future Work




                   27
MREED Model
(MATHEMATICAL RELIABILITY MODELS FOR ENERGY-EFFICIENT RAID SYSTEMS)



                        Access Pattern                         Temperature



                  Energy Conservation Techniques


                                              Utilization




                Frequency            Weibull Analysis




                                         Annual Failure Rate




                                         28
Weibull Analysis
A Leading Method for Fitting Life Data
Advantages:
 Accurate
 Small Samples
 Widely Used
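The slides do not list the fitted Weibull parameters, so the block below is only a sketch of the standard two-parameter Weibull functions used in life data analysis: the reliability function R(t) = exp(-(t/scale)^shape) and the hazard (instantaneous failure) rate. The shape and scale values are hypothetical, and treating the failure probability over 8760 power-on hours as an annual failure rate is an assumption about how the fit is used.

```python
import math


def weibull_reliability(t: float, shape: float, scale: float) -> float:
    """Probability that a disk survives past time t (same unit as scale)."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t: float, shape: float, scale: float) -> float:
    """Instantaneous failure rate at time t."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def failure_probability_within(t: float, shape: float, scale: float) -> float:
    """Probability of failing at or before time t."""
    return 1.0 - weibull_reliability(t, shape, scale)


# Hypothetical parameters: shape > 1 models wear-out, scale is in hours.
SHAPE, SCALE_HOURS = 1.2, 200_000.0
one_year = failure_probability_within(8760.0, SHAPE, SCALE_HOURS)
print(f"Failure probability within one year: {one_year:.4%}")
```

A shape value below 1 would model early-life failures, a value of 1 a constant failure rate, and a value above 1 wear-out behaviour.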




                   29
MREED Model
                   (Energy Conservation Techniques- PARAID)




Diagram layers: Soft State, RAID, Gears (1, 2, 3)




                                    Power-Aware RAID (PA-RAID)[5]
                                          System Structure




[5] Charles Weddle, Mathew Oldham, Jin Qian, An-I Andy Wang. PARAID: A Gear-Shifting Power-Aware RAID.
USENIX FAST 2007.



                                                 30
Reliability Evaluation(Experiment Setup)



Parameter                        Value
Disk Type                        Seagate ST3146855FC
Capacity                         146 GB
Cache Size                       Sata 16MB
Buffer to Host Transfer Rate     4 Gb/s (Max)
Total Number of Disks            5
File Size                        100 MB
Number of Files                  1000
Synthetic Trace                  Poisson Distribution
Time Period                      24 Hours
Interval Time (Time Phase)       1 Hour
Power-On Hours per Year          8760 Hours
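A synthetic trace matching the setup above can be sketched as a Poisson arrival process, whose inter-arrival times are exponentially distributed. The sketch below generates 24 hours of requests over 1000 files and counts them per 1-hour time phase. The uniform choice of target file, the fixed seed, and all function names are assumptions for illustration; the presentation does not describe how the trace was produced.

```python
import random
from typing import List, Tuple


def poisson_trace(rate_per_hour: float, duration_hours: float,
                  num_files: int, seed: int = 0) -> List[Tuple[float, int]]:
    """Return (arrival_time_in_hours, file_id) pairs for a Poisson process."""
    rng = random.Random(seed)
    trace = []
    t = 0.0
    while True:
        # Exponential inter-arrival times yield Poisson-distributed counts.
        t += rng.expovariate(rate_per_hour)
        if t >= duration_hours:
            break
        trace.append((t, rng.randrange(num_files)))  # uniform file choice (assumed)
    return trace


def requests_per_phase(trace: List[Tuple[float, int]],
                       duration_hours: float,
                       phase_hours: float = 1.0) -> List[int]:
    """Count requests falling into each time phase."""
    counts = [0] * int(duration_hours / phase_hours)
    for arrival, _ in trace:
        counts[int(arrival // phase_hours)] += 1
    return counts


# Low access rate case from the evaluation: 20 requests per hour.
trace = poisson_trace(rate_per_hour=20.0, duration_hours=24.0, num_files=1000)
counts = requests_per_phase(trace, duration_hours=24.0)
print(len(trace), counts)
```

Calling the generator with rate_per_hour=80.0 gives the high access rate case used in the later comparisons.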




                                  31
Reliability Evaluation
                            (Disk Utilization Comparison)




Disk Utilization Comparison Between PARAID-0 and RAID-0 at A Low Access Rate (20/hr)



                                        32
Reliability Evaluation
                             (Disk Utilization Comparison)




Disk Utilization Comparison Between PARAID-0 and RAID-0 at A High Access Rate (80/hr)



                                         33
Reliability Evaluation (AFR Comparison)




AFR Comparison Between PARAID-0 and RAID-0 at A Low Access Rate (20/hr)



                                  34
Reliability Evaluation (AFR Comparison)




        AFR




AFR Comparison Between PARAID-0 and RAID-0 at A High Access Rate (80/hr)



                                  35
Presentation Outline
• Motivation
• MINT Model

• MREED Model


•   Models Validation
• Reliability Improvement
• Conclusion and Future Work




                   36
Model Validation
Techniques
–   Run the Systems for A Couple of Decades



–   The Event Validity Validation Techniques[6]




         [6] R.G. Sargent, “Verification and Validation of Simulation Models”, in Proceedings of the 37th
         Winter Simulation Conference (WSC ’05), 2005.



                                                               37
Model Validation
Challenges
 Unable to Monitor PARAID Running for Years



 Sample Size is Small from A Validation
 Perspective (e.g. 100 Disks for Five Years)




                       38
Model Validation                                                                    (DiskSim[7] Simulation)




                                File To Block Level Converter
 [7] J.S. Bucy, J. Schindler, S.W. Schlosser, and G.R. Ganger, “The DiskSim Simulation Environment Version 4.0
 Reference Manual”, 2008


                                                     39
Model Validation                                  (DiskSim Simulation)




 Diagram of the Storage System Corresponding to the DiskSim RAID-0



                              40
Model Validation                                    (Result)




Utilization Comparison Between MREED and DiskSim Simulator



                           41
Model Validation                                     (Result)




Gear Shifting Comparison Between MREED and DiskSim Simulator



                            42
Presentation Outline
• Motivation
• MINT Model

• MREED Model

• Models Validation


•   Reliability Improvement
•   Conclusion and Future Work


                      43
Recall PDC




  Popular Data Concentration (PDC)          - cold data
          System Structure
                                            - hot data
 (Optimal Result for Certain Time Phases)




                 44
Problem of PDC
 The Most Popular Disk:
High AFR
No Replica




                    45
Reliability Improvement of PDC
Method of Improving Reliability
 Mirroring
Extra Disks for Replication -> More Energy Consumption

 Disk Swapping
Swap Existing Disks




                           46
Disk Swapping Scheme
                        PDC




Swap the Most Popular Disk with the Least Popular Disk


                          47
Disk Swapping Scheme
                      PDC




Swap the Highest AFR Disk with the Lowest AFR Disk
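The AFR-based swap can be sketched as exchanging the roles of two physical disks once the access rate reaches a swapping threshold. The block below is a simplified illustration: the list index stands for a popularity slot (0 is the most popular position), the AFR values are hypothetical, and triggering the swap when the access rate reaches the threshold is an assumption about how the threshold is applied.

```python
from typing import Dict, List


def swap_highest_and_lowest_afr(slot_to_disk: List[str],
                                afr: Dict[str, float]) -> List[str]:
    """Exchange the slots of the highest-AFR and lowest-AFR disks."""
    worst = max(slot_to_disk, key=lambda d: afr[d])
    best = min(slot_to_disk, key=lambda d: afr[d])
    swapped = list(slot_to_disk)
    i, j = swapped.index(worst), swapped.index(best)
    swapped[i], swapped[j] = swapped[j], swapped[i]
    return swapped


def maybe_swap(slot_to_disk: List[str], afr: Dict[str, float],
               access_rate: float, threshold: float) -> List[str]:
    """Swap once the access rate reaches the threshold (assumed trigger)."""
    if access_rate >= threshold:
        return swap_highest_and_lowest_afr(slot_to_disk, afr)
    return list(slot_to_disk)


# Hypothetical state: d0 sits in the most popular slot and has the highest AFR.
slots = ["d0", "d1", "d2"]
afr = {"d0": 0.09, "d1": 0.05, "d2": 0.02}
print(maybe_swap(slots, afr, access_rate=3e5, threshold=2e5))
```

Unlike mirroring, this reuses the existing disks, so it adds no extra drives to power.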


                        48
Disk Swapping Scheme
                 MAID




 Swap the Cache Disks with the Data Disks


                    49
Preliminary Results                               (experimental setting)




Energy-efficiency Scheme    Number of Disks                    File Access Rate (No. per month)    File Size (KB)
PDC                         20 data (20 in total)              0~10^6                              300
MAID-1                      15 data + 5 cache (20 in total)    0~10^6                              300
MAID-2                      20 data + 5 cache (25 in total)    0~10^6                              300

•   Read-only Disks
•   Mean Time to Data Loss (MTTDL)
•   Swapping Thresholds (2×10^5, 5×10^5, 8×10^5 No./Month)
•   Single Swapping
                                         50
Comparison of Disk Swap
                    PDC




             AFR Comparison of PDC
    Access Rate (×10^4) Impacts on AFR (T=35°C)
           Threshold = 2×10^5 No./Month
                       51
Comparison of Disk Swap
                           PDC
 AFR:
 Swap2 < Swap1 < No Swap




                   AFR Comparison of PDC
          Access Rate (×10^4) Impacts on AFR (T=35°C)
                 Threshold = 2×10^5 No./Month
                             52
Comparison Between Different Threshold
                          PDC




                   AFR Comparison of PDC
          Access Rate (×10^4) Impacts on AFR (T=35°C)
                 Threshold = 2×10^5 No./Month
                             53
Comparison Between Different Threshold
                          PDC




                   AFR Comparison of PDC
          Access Rate (×10^4) Impacts on AFR (T=35°C)
                 Threshold = 5×10^5 No./Month
                             54
Comparison Between Different Threshold
                          PDC




                   AFR Comparison of PDC
          Access Rate (×10^4) Impacts on AFR (T=35°C)
                 Threshold = 8×10^5 No./Month
                             55
Comparison Between Different Threshold
                                    PDC


AFR
Higher Threshold -> Lower AFR




                              AFR Comparison of PDC
                    Access Rate (×10^4) Impacts on AFR (T=35°C)
          Threshold = 2×10^5 No./Month, 5×10^5 No./Month, 8×10^5 No./Month
                                       56
Limitations
•   Read Only Disk Scenario

•   Data Migration within Certain Time Phases

•   Simple File Access Patterns




                        57
Future Work
• Extend the models to investigate mixed read/write workloads;
• Research the trade-offs between reliability and energy efficiency;
• Extend the schemes to a real-world environment;
• Develop a multi-swapping mechanism that balances utilization and lowers the failure rate;
• Evaluate more control groups.




                                58
Conclusion
•   Generic Models coupled with power
    management optimization policies;
•   Two reliability models for the three well-known
    energy-saving schemes -- PDC, MAID and PARAID;
•   Disk swapping strategies to improve disk
    reliability for PDC.




                         59
Thanks
Questions?

Contenu connexe

Similaire à Reliability Modeling and Analysis of Energy-Efficient Storage Systems

Simple regenerating codes: Network Coding for Cloud Storage
Simple regenerating codes: Network Coding for Cloud StorageSimple regenerating codes: Network Coding for Cloud Storage
Simple regenerating codes: Network Coding for Cloud Storage
Kevin Tong
 
QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)
QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)
QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)
Heiko Joerg Schick
 
Backup netezza-tsm-v1403c-140330170451-phpapp01
Backup netezza-tsm-v1403c-140330170451-phpapp01Backup netezza-tsm-v1403c-140330170451-phpapp01
Backup netezza-tsm-v1403c-140330170451-phpapp01
Arunkumar Shanmugam
 
Caqa5e ch1 with_review_and_examples
Caqa5e ch1 with_review_and_examplesCaqa5e ch1 with_review_and_examples
Caqa5e ch1 with_review_and_examples
Aravindharamanan S
 
What is the future of disk drives?
What is the future of disk drives?What is the future of disk drives?
What is the future of disk drives?
Iftikhar Alam
 

Similaire à Reliability Modeling and Analysis of Energy-Efficient Storage Systems (20)

Reliability Analysis for an Energy-Aware RAID System
Reliability Analysis for an Energy-Aware RAID SystemReliability Analysis for an Energy-Aware RAID System
Reliability Analysis for an Energy-Aware RAID System
 
Green IT in the boardroom, Jose Iglesias Symantec
Green IT in the boardroom, Jose Iglesias SymantecGreen IT in the boardroom, Jose Iglesias Symantec
Green IT in the boardroom, Jose Iglesias Symantec
 
Feeding the Multicore Beast:It’s All About the Data!
Feeding the Multicore Beast:It’s All About the Data!Feeding the Multicore Beast:It’s All About the Data!
Feeding the Multicore Beast:It’s All About the Data!
 
CAQA5e_ch1 (3).pptx
CAQA5e_ch1 (3).pptxCAQA5e_ch1 (3).pptx
CAQA5e_ch1 (3).pptx
 
Chapter 1.pptx
Chapter 1.pptxChapter 1.pptx
Chapter 1.pptx
 
Simple regenerating codes: Network Coding for Cloud Storage
Simple regenerating codes: Network Coding for Cloud StorageSimple regenerating codes: Network Coding for Cloud Storage
Simple regenerating codes: Network Coding for Cloud Storage
 
Webinar: How Snapshots CAN be Backups
Webinar: How Snapshots CAN be BackupsWebinar: How Snapshots CAN be Backups
Webinar: How Snapshots CAN be Backups
 
QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)
QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)
QPACE - QCD Parallel Computing on the Cell Broadband Engine™ (Cell/B.E.)
 
Distribute Storage System May-2014
Distribute Storage System May-2014Distribute Storage System May-2014
Distribute Storage System May-2014
 
Ibm pure data system for analytics n200x
Ibm pure data system for analytics n200xIbm pure data system for analytics n200x
Ibm pure data system for analytics n200x
 
Software Faults, Failures and Their Mitigations | Turing100@Persistent
Software Faults, Failures and Their Mitigations | Turing100@PersistentSoftware Faults, Failures and Their Mitigations | Turing100@Persistent
Software Faults, Failures and Their Mitigations | Turing100@Persistent
 
Tugas pp
Tugas ppTugas pp
Tugas pp
 
Chip Multiprocessing and the Cell Broadband Engine.pdf
Chip Multiprocessing and the Cell Broadband Engine.pdfChip Multiprocessing and the Cell Broadband Engine.pdf
Chip Multiprocessing and the Cell Broadband Engine.pdf
 
Michael Gschwind, Chip Multiprocessing and the Cell Broadband Engine
Michael Gschwind, Chip Multiprocessing and the Cell Broadband EngineMichael Gschwind, Chip Multiprocessing and the Cell Broadband Engine
Michael Gschwind, Chip Multiprocessing and the Cell Broadband Engine
 
Backup netezza-tsm-v1403c-140330170451-phpapp01
Backup netezza-tsm-v1403c-140330170451-phpapp01Backup netezza-tsm-v1403c-140330170451-phpapp01
Backup netezza-tsm-v1403c-140330170451-phpapp01
 
Umit hw6
Umit hw6Umit hw6
Umit hw6
 
The Cloud & Its Impact on IT
The Cloud & Its Impact on ITThe Cloud & Its Impact on IT
The Cloud & Its Impact on IT
 
Caqa5e ch1 with_review_and_examples
Caqa5e ch1 with_review_and_examplesCaqa5e ch1 with_review_and_examples
Caqa5e ch1 with_review_and_examples
 
What is the future of disk drives?
What is the future of disk drives?What is the future of disk drives?
What is the future of disk drives?
 
KIISE:SIGDB Workshop presentation.
KIISE:SIGDB Workshop presentation.KIISE:SIGDB Workshop presentation.
KIISE:SIGDB Workshop presentation.
 

Plus de Xiao Qin

P#1 stream of praise
P#1 stream of praiseP#1 stream of praise
P#1 stream of praise
Xiao Qin
 

Plus de Xiao Qin (20)

How to apply for internship positions?
How to apply for internship positions?How to apply for internship positions?
How to apply for internship positions?
 
How to write research papers? Version 5.0
How to write research papers? Version 5.0How to write research papers? Version 5.0
How to write research papers? Version 5.0
 
Making a competitive nsf career proposal: Part 2 Worksheet
Making a competitive nsf career proposal: Part 2 WorksheetMaking a competitive nsf career proposal: Part 2 Worksheet
Making a competitive nsf career proposal: Part 2 Worksheet
 
Making a competitive nsf career proposal: Part 1 Tips
Making a competitive nsf career proposal: Part 1 TipsMaking a competitive nsf career proposal: Part 1 Tips
Making a competitive nsf career proposal: Part 1 Tips
 
Auburn csse faculty orientation
Auburn csse faculty orientationAuburn csse faculty orientation
Auburn csse faculty orientation
 
Auburn CSSE graduate student orientation
Auburn CSSE graduate student orientationAuburn CSSE graduate student orientation
Auburn CSSE graduate student orientation
 
CSSE Graduate Programs Committee: Progress Report
CSSE Graduate Programs Committee: Progress ReportCSSE Graduate Programs Committee: Progress Report
CSSE Graduate Programs Committee: Progress Report
 
Project 2 How to modify os161: A Manual
Project 2 How to modify os161: A ManualProject 2 How to modify os161: A Manual
Project 2 How to modify os161: A Manual
 
Project 2 how to modify OS/161
Project 2 how to modify OS/161Project 2 how to modify OS/161
Project 2 how to modify OS/161
 
Project 2 how to install and compile os161
Project 2 how to install and compile os161Project 2 how to install and compile os161
Project 2 how to install and compile os161
 
Project 2 - how to compile os161?
Project 2 - how to compile os161?Project 2 - how to compile os161?
Project 2 - how to compile os161?
 
Understanding what our customer wants-slideshare
Understanding what our customer wants-slideshareUnderstanding what our customer wants-slideshare
Understanding what our customer wants-slideshare
 
OS/161 Overview
OS/161 OverviewOS/161 Overview
OS/161 Overview
 
Surviving a group project
Surviving a group projectSurviving a group project
Surviving a group project
 
P#1 stream of praise
P#1 stream of praiseP#1 stream of praise
P#1 stream of praise
 
Data center specific thermal and energy saving techniques
Data center specific thermal and energy saving techniquesData center specific thermal and energy saving techniques
Data center specific thermal and energy saving techniques
 
How to do research?
How to do research?How to do research?
How to do research?
 
COMP2710 Software Construction: header files
COMP2710 Software Construction: header filesCOMP2710 Software Construction: header files
COMP2710 Software Construction: header files
 
COMP2710: Software Construction - Linked list exercises
COMP2710: Software Construction - Linked list exercisesCOMP2710: Software Construction - Linked list exercises
COMP2710: Software Construction - Linked list exercises
 
HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...
HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...
HDFS-HC2: Analysis of Data Placement Strategy based on Computing Power of Nod...
 

Dernier

Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
WSO2
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Safe Software
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Victor Rentea
 

Dernier (20)

TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
Emergent Methods: Multi-lingual narrative tracking in the news - real-time ex...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024FWD Group - Insurer Innovation Award 2024
FWD Group - Insurer Innovation Award 2024
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model

Reliability Modeling and Analysis of Energy-Efficient Storage Systems

  • 1. Reliability Modeling and Analysis of Energy-Efficient Storage Systems Shu Yin Advisor: Dr. Xiao Qin Committee Members: Dr. Sanjeev Baskiyar Dr. Alvin Lim University Reader: Dr. Shiwen Mao
  • 2. Presentation Outline • Motivation • MINT Model • MREED Model • Models Validation • Reliability Improvement • Conclusion and Future Work
  • 3. Motivation: Data Intensive Applications (Stream Multimedia, Bioinformatics, 3D Graphics, Weather Forecast)
  • 4. Data Intensive Computing Application: Cluster System
  • 5. Problem: Energy Dissipation (source: EPA Report to Congress on Server and Data Center Energy Efficiency, 2007)
  • 6. Problem: Energy Dissipation (cont.) Using 2010 Historical Trends Scenario • Data Centers Consume 110 Billion kWh per Year; • Assume Average Commercial End User Is Charged 9.46¢ per kWh; • Disk System Can Account for 27% of the Computing Energy Cost of Data Centers (pie chart: Disk System 27%, Other 73%). Disk System May Have An Electrical Cost of 2.8 Billion Dollars!
  • 7. Existing Energy Conservation Techniques: Software-Directed Power Management, Dynamic Power Management, Redundancy Technique, Multi-speed Setting. How Reliable Are They?
  • 8. Contradiction Between Energy Efficiency and Reliability. Example: Disk Spin Up and Down
  • 9. Presentation Outline • Motivation • MINT Model • MREED Model • Models Validation • Reliability Improvement • Conclusion and Future Work
  • 10. MINT (MATHEMATICAL RELIABILITY MODELS FOR ENERGY-EFFICIENT PARALLEL DISK SYSTEMS): Energy Conservation Techniques, Single Disk Reliability Model, System-Level Reliability Model
  • 11. MINT (Single Disk) Disk Age, Temperature, Frequency, Utilization -> Single Disk Reliability Model -> Reliability of Single Disk
  • 12. MINT (Single Disk) R = α * BaseValue[1] * TemperatureFactor + β * FrequencyAdder[2], where α and β are two coefficients of R. Assumption: α = β = 1 in our research. [1] E. Pinheiro, W.-D. Weber, and L.A. Barroso. Failure trends in a large disk drive population. Proc. USENIX Conf. File and Storage Tech., February 2007. [2] IDEMA Standards. Specification of hard disk drive reliability.
  • 13. MINT (Single Disk) R = α * BaseValue * TemperatureFactor + β * FrequencyAdder. Figures: Utilization Impact on AFR; Temperature Impact on Temperature Factor; Transition Frequency Impact on Frequency Adder
  • 14. MINT (Single Disk) R = α * BaseValue * TemperatureFactor + β * FrequencyAdder. Figure: Single Disk Reliability, with Base Value from Google Report[3], for Frequency=350/Month, T=40°C; Frequency=250/Month, T=40°C; Frequency=350/Month, T=35°C; Frequency=250/Month, T=35°C. [3] E. Pinheiro, W.-D. Weber, and L.A. Barroso. Failure trends in a large disk drive population. Proc. USENIX Conf. File and Storage Tech., February 2007.
  • 15. MINT (Energy Conservation Techniques - PDC) Popular Data Concentration (PDC)[3] System Structure (legend: cold data, hot data) [3] E. Pinheiro and R. Bianchini. Energy conservation techniques for disk array-based servers. Int’l Conf. on Supercomputing, pages 68–78, June 2004.
  • 16. MINT (Energy Conservation Techniques - PDC) More Popular Disk, Less Popular Disk; data migration conditions shown: Access Rate < MIN(Access Rate), Access Rate > MAX(Access Rate) (legend: cold data, hot data)
  • 17. MINT (Energy Conservation Techniques - PDC) Popular Data Concentration (PDC)[3] System Structure (Optimal Result for Certain Time Phases; legend: cold data, hot data)
  • 18. MINT (Energy Conservation Techniques - MAID) Massive Array of Idle Disks (MAID)[4] System Structure (legend: cold data, hot data) [4] Dennis Colarelli and Dirk Grunwald. Massive arrays of idle disks for storage archives. Supercomputing ’02: Proceedings of the 2002 ACM/IEEE conference on Supercomputing, pages 1–11, Los Alamitos, CA, USA, 2002. IEEE Computer Society Press.
  • 19. MINT (Energy Conservation Techniques - MAID) Massive Array of Idle Disks (MAID)[4] System Structure: Cache Disk, Data Disk; condition shown: Access Rate > MAX(Access Rate) (legend: cold data, hot data) [4] Dennis Colarelli and Dirk Grunwald. Massive arrays of idle disks for storage archives. Supercomputing ’02: Proceedings of the 2002 ACM/IEEE conference on Supercomputing, pages 1–11, Los Alamitos, CA, USA, 2002. IEEE Computer Society Press.
  • 20. MINT (System-Level) Access Pattern, Disk Age, Temperature; Energy Conservation Techniques; Frequency and Utilization (per disk); Single Disk Reliability Model; Reliability of Disk 1 ... Reliability of Disk n; System-Level Reliability Model; Reliability of A Parallel Disk System
  • 21. Preliminary Results (experimental setting) Columns: Energy-efficiency Scheme, File Access Rate (No. per month), File Size (KB), Number of Disks. PDC: 0~10^6, 300, 20 data (20 in total); MAID-1: 0~10^6, 300, 15 data + 5 cache (20 in total); MAID-2: 0~10^6, 300, 20 data + 5 cache (25 in total). Read-only Disks
  • 22. Preliminary Result: Comparison Between PDC and MAID. Figure: AFR Comparison of PDC and MAID, Access Rate (*10^4) Impacts on AFR (T=35°C)
  • 23. Preliminary Result: Comparison Between PDC and MAID. Figure: AFR Comparison of PDC and MAID, Access Rate (*10^4) Impacts on AFR (T=35°C) (legend: PDC, MAID)
  • 24. MAID under High Access Rate (MAID-1, MAID-2). Figure: AFR Comparison of PDC and MAID, Access Rate (*10^4) Impacts on AFR (T=35°C)
  • 25. MAID under High Access Rate (MAID-1, MAID-2). Figure: AFR Comparison of PDC and MAID, Access Rate (*10^4) Impacts on AFR (T=35°C)
  • 26. MINT (conclusion) Mathematical Model for Disk Systems; MINT Study on PDC and MAID. But ... What about RAID? Data Striping Mechanism, Energy Consumption Issues, Reliability Issues, Complexity
  • 27. Presentation Outline • Motivation • MINT Model • MREED Model • Models Validation • Reliability Improvement • Conclusion and Future Work
  • 28. MREED Model (MATHEMATICAL RELIABILITY MODELS FOR ENERGY-EFFICIENT RAID SYSTEMS) Access Pattern, Temperature; Energy Conservation Techniques; Utilization, Frequency; Weibull Analysis; Annual Failure Rate
  • 29. Weibull Analysis: A Leading Method for Fitting Life Data. Advantages: Accurate, Small Samples, Widely Used
  • 30. MREED Model (Energy Conservation Techniques - PARAID) Power-Aware RAID (PA-RAID)[5] System Structure: Soft State, RAID, Gears 1, 2, 3 [5] Charles Weddle, Mathew Oldham, Jin Qian, An-I Andy Wang. PARAID: A Gear-Shifting Power-Aware RAID. USENIX FAST 2007.
  • 31. Reliability Evaluation (Experiment Setup) Disk Type: Seagate ST3146855FC; Capacity: 146 GB; Cache Size: Sata 16MB; Buffer to Host Transfer Rate: 4Gb/s (Max); Total Number of Disks: 5; File Size: 100 MB; Number of Files: 1000; Synthetic Trace: Poisson Distribution; Time Period: 24 Hours; Interval Time (Time Phase): 1 Hour; Power on Hour Per Year: 8760 Hours
  • 32. Reliability Evaluation (Disk Utilization Comparison) Figure: Disk Utilization Comparison Between PARAID-0 and RAID-0 at A Low Access Rate (20/hr)
  • 33. Reliability Evaluation (Disk Utilization Comparison) Figure: Disk Utilization Comparison Between PARAID-0 and RAID-0 at A High Access Rate (80/hr)
  • 34. Reliability Evaluation (AFR Comparison) Figure: AFR Comparison Between PARAID-0 and RAID-0 at A Low Access Rate (20/hr)
  • 35. Reliability Evaluation (AFR Comparison) Figure: AFR Comparison Between PARAID-0 and RAID-0 at A High Access Rate (80/hr)
  • 36. Presentation Outline • Motivation • MINT Model • MREED Model • Models Validation • Reliability Improvement • Conclusion and Future Work
  • 37. Model Validation Techniques – Run the Systems for A Couple of Decades – The Event Validity Validation Techniques[6] [6] R.G. Sargent, “Verification and Validation of Simulation Models”, in Proceedings of the 37th conference on Winter Simulation, ser. WSC’05 Winter Simulation Conference, 2005.
  • 38. Model Validation Challenges: Unable to Monitor PARAID Running for Years; Sample Size is Small from A Validation Perspective (e.g. 100 Disks for Five Years)
  • 39. Model Validation (DiskSim[7] Simulation) File To Block Level Converter [7] S.W.S John, S. Bucy, Jiri Schindler and G.R. Ganger, “The DiskSim Simulation Environment Version 4.0 Reference Manual”, 2008
  • 40. Model Validation (DiskSim Simulation) Figure: Diagram of the Storage System Corresponding to the DiskSim RAID-0
  • 41. Model Validation (Result) Figure: Utilization Comparison Between MREED and DiskSim Simulator
  • 42. Model Validation (Result) Figure: Gear Shifting Comparison Between MREED and DiskSim Simulator
  • 43. Presentation Outline • Motivation • MINT Model • MREED Model • Models Validation • Reliability Improvement • Conclusion and Future Work
  • 44. Recall PDC: Popular Data Concentration (PDC) System Structure (Optimal Result for Certain Time Phases; legend: cold data, hot data)
  • 45. Problem of PDC: The Most Popular Disk: High AFR, No Replica
  • 46. Reliability Improvement of PDC. Methods of Improving Reliability: Mirroring (Extra Disks for Replication -> More Energy Consumption); Disk Swapping (Swap Existing Disks)
  • 47. Disk Swapping Scheme (PDC): Swap the Most Popular Disk with the Least Popular Disk
  • 48. Disk Swapping Scheme (PDC): Swap the Highest AFR Disk with the Lowest AFR Disk
  • 49. Disk Swapping Scheme (MAID): Swap the Cache Disks with the Data Disks
  • 50. Preliminary Results (experimental setting) Columns: Energy-efficiency Scheme, File Access Rate (No. per month), File Size (KB), Number of Disks. PDC: 0~10^6, 300, 20 data (20 in total); MAID-1: 0~10^6, 300, 15 data + 5 cache (20 in total); MAID-2: 0~10^6, 300, 20 data + 5 cache (25 in total). • Read-only Disks • Mean Time to Data Loss (MTTDL) • Swapping Thresholds (2*10^5, 5*10^5, 8*10^5 No./Month) • Single Swapping
  • 51. Comparison of Disk Swap (PDC) Figure: AFR Comparison of PDC, Access Rate (*10^4) Impacts on AFR (T=35°C), Threshold = 2*10^5 No./Month
  • 52. Comparison of Disk Swap (PDC) AFR: Swap2 < Swap1 < No Swap. Figure: AFR Comparison of PDC, Access Rate (*10^4) Impacts on AFR (T=35°C), Threshold = 2*10^5 No./Month
  • 53. Comparison Between Different Thresholds (PDC) Figure: AFR Comparison of PDC, Access Rate (*10^4) Impacts on AFR (T=35°C), Threshold = 2*10^5 No./Month
  • 54. Comparison Between Different Thresholds (PDC) Figure: AFR Comparison of PDC, Access Rate (*10^4) Impacts on AFR (T=35°C), Threshold = 5*10^5 No./Month
  • 55. Comparison Between Different Thresholds (PDC) Figure: AFR Comparison of PDC, Access Rate (*10^4) Impacts on AFR (T=35°C), Threshold = 8*10^5 No./Month
  • 56. Comparison Between Different Thresholds (PDC) AFR: Higher Threshold -> Lower AFR. Figure: AFR Comparison of PDC, Access Rate (*10^4) Impacts on AFR (T=35°C), Threshold = 2*10^5 No./Month, 5*10^5 No./Month, 8*10^5 No./Month
  • 57. Limitations • Read Only Disk Scenario • Data Migration within Certain Time Phases • Simple File Access Patterns
  • 58. Future Work: Extend the models to investigate mixed read/write workloads; Research the trade-offs between reliability and energy efficiency; Extend the schemes to a real-world based environment; Develop a multi-swapping mechanism that balances utilization and lowers the failure rate; Evaluate more control groups.
  • 59. Conclusion • Generic models coupled with power management optimization policies; • Two reliability models for the three well-known energy-saving schemes: PDC, MAID and PARAID; • Disk swapping strategies to improve disk reliability for PDC.
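
The single-disk formula on slides 12 to 14, R = α * BaseValue * TemperatureFactor + β * FrequencyAdder, can be sketched in a few lines of Python. This is only an illustration of the arithmetic: the function name `single_disk_r` and every numeric input below are hypothetical and are not taken from the slides; only the shape of the formula and the default α = β = 1 come from the presentation.

```python
def single_disk_r(base_value, temperature_factor, frequency_adder,
                  alpha=1.0, beta=1.0):
    """Evaluate R = alpha * BaseValue * TemperatureFactor + beta * FrequencyAdder.

    alpha and beta default to 1, the assumption stated on slide 12.
    """
    return alpha * base_value * temperature_factor + beta * frequency_adder


# Hypothetical inputs chosen for illustration only (not from the slides).
example_r = single_disk_r(base_value=0.02,
                          temperature_factor=1.5,
                          frequency_adder=0.005)
print(f"R = {example_r:.3f}")
```

In the presentation the three inputs are read from the utilization, temperature, and transition-frequency curves on slide 13; here they are simply passed in as numbers.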

Editor's notes

  1. Hot data: data with an increasing access rate. Cold data: data that has not been accessed for a while and has a decreasing access rate. Most popular disk: stores most of the hot data. Least popular disk: stores most of the cold data.
  2. Cache disks only handle the data copied in, while data disks only handle the data copied out.
  3. Many applications are read-only, e.g. server systems like youtube.com, with more downloading than uploading.
  4. Utilization sensitivity: PDC is higher than MAID. AFR: when the access rate is low, PDC is lower than MAID; the reason is on the following slide.
  5. From the access rate-utilization model, the higher the access rate is, the higher the utilization of the system will be; however, from the utilization-AFR model,
  6. The AFR of MAID will NOT keep decreasing. At a high access rate, the AFR of MAID-1 is higher than that of MAID-2; the reason is on the following slide.
  7. Workload: (1) file accessing, (2) file movement, (3) parity data (for level 5).
  8. Problems that need to be solved.
  9. Why use a benchmark to validate: it is impractical to run the systems in real life, and the sample size is still small. Why validate only the access rate-utilization model: the utilization-AFR model is based on maintenance data from the report.
  10. Why model the reliability of PARAID? For RAID with an energy-saving scheme, a disk failure causes more problems: total data loss (RAID 0) or more time and energy spent on data recovery (RAID 5).
  11. Why not improve the reliability of PARAID? The reliability of RAID with an energy-saving scheme is still under research. Level 0: hardly any room for improvement. Level 1: need to find the balance point between energy and reliability. Level 5: complexity, performance, energy, and reliability.
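
The PDC idea in the notes above (the most popular disk holds most of the hot data) and the single disk swap from slides 47 and 50 can be sketched as follows. This is a minimal sketch under stated assumptions, not the author's implementation: the helper names `pdc_layout`, `disk_access_rate`, and `swap_once`, the fixed `files_per_disk` capacity, the sample access rates, and the rule "swap when the most popular disk's total access rate exceeds the threshold" are all hypothetical. Only the concentration of popular data and the swap of the most popular disk with the least popular disk come from the slides.

```python
def pdc_layout(access_rates, num_disks, files_per_disk):
    """Place files in descending access-rate order; disk 0 is the most popular."""
    ranked = sorted(access_rates, key=access_rates.get, reverse=True)
    return [ranked[i * files_per_disk:(i + 1) * files_per_disk]
            for i in range(num_disks)]


def disk_access_rate(files, access_rates):
    """Total access rate (accesses per month) served by one disk."""
    return sum(access_rates[name] for name in files)


def swap_once(disks, access_rates, threshold):
    """Exchange the file sets of the most and least popular disks.

    The swap fires only if the most popular disk's access rate exceeds
    the threshold (an assumed trigger). Returns True if a swap happened.
    """
    loads = [disk_access_rate(files, access_rates) for files in disks]
    hot = loads.index(max(loads))
    cold = loads.index(min(loads))
    if hot == cold or loads[hot] <= threshold:
        return False
    disks[hot], disks[cold] = disks[cold], disks[hot]
    return True


# Hypothetical monthly access rates for four files on two disks.
rates = {"a": 500_000, "b": 300_000, "c": 1_000, "d": 10}
disks = pdc_layout(rates, num_disks=2, files_per_disk=2)
swapped = swap_once(disks, rates, threshold=200_000)
print(disks, swapped)
```

Here each entry of `disks` stands for the file set stored on one physical disk, so swapping two entries moves the hot data onto the disk that has so far been the least used.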