With the rapid growth in the production and storage of large-scale data sets, it is important to investigate methods that drive down the cost of storage systems. Many energy conservation techniques have been proposed to achieve high energy efficiency in disk systems. Unfortunately, growing evidence shows that energy-saving schemes in disk drives usually have a negative impact on the reliability of storage systems. Existing reliability models are inadequate for estimating the reliability of parallel disk systems equipped with energy conservation techniques. To solve this problem, we first propose a mathematical model, called MINT, to evaluate the reliability of a parallel disk system in which energy-saving mechanisms are implemented. In this dissertation, MINT focuses on modeling the reliability impacts of two well-known energy-saving techniques: Popular Data Concentration (PDC) and the Massive Array of Idle Disks (MAID). Unlike MAID and PDC, which store a complete file on a single disk, the Redundant Array of Inexpensive Disks (RAID) stripes a file into several parts and stores them on different disks to achieve higher parallelism and hence higher I/O performance. However, RAID faces more challenges in energy efficiency and reliability. To evaluate the reliability of power-aware RAID, we then develop a Weibull-based model, MREED. In this dissertation, we use MREED to model the reliability impacts of a well-known energy-efficient storage mechanism, Power-Aware RAID (PARAID). Third, we focus on validating the two models, MINT and MREED. Validating the accuracy of reliability models is challenging, since observing energy-efficient systems for a couple of decades is too time-consuming and costly. We use a validated storage system simulator, DiskSim, to determine whether our model and DiskSim agree with one another. In our validation process, we compare the two using a file access trace from a real-world file system. The last part of this dissertation focuses on improving energy-efficient parallel storage systems. We propose a strategy, disk swapping, that improves disk reliability by alternating disks that store frequently accessed data with disks that hold less frequently accessed data. In this part, we focus on studying the reliability improvement of PDC and MAID. Finally, we further improve disk reliability by introducing a multiple disk swapping strategy.
Reliability Modeling and Analysis of Energy-Efficient Storage Systems
1. Reliability Modeling and Analysis of
Energy-Efficient Storage Systems
Shu Yin
Advisor: Dr. Xiao Qin
Committee Members: Dr. Sanjeev Baskiyar
Dr. Alvin Lim
University Reader: Dr. Shiwen Mao
6. Problem: Energy Dissipation (cont.)
Using 2010 Historical Trends Scenario
• Data Centers Consume 110 Billion kWh per Year;
• Assume the Average Commercial End User Is Charged 9.46¢ per kWh;
• The Disk System Can Account for 27% of the Computing Energy Cost of Data Centers.
[Pie chart: Disk System 27%, Other 73%]
The Disk System May Have an Electrical Cost of 2.8 Billion Dollars!
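The 2.8 billion dollar figure follows from multiplying the three numbers on this slide. A minimal sketch of that arithmetic, using only the slide's own figures (the function and constant names are illustrative, not from the presentation):

```python
# Figures taken from the slide (2010 historical-trends scenario).
DATA_CENTER_KWH_PER_YEAR = 110e9   # 110 billion kWh per year
PRICE_USD_PER_KWH = 0.0946         # 9.46 cents per kWh
DISK_SHARE = 0.27                  # disk systems' share of the energy cost


def disk_energy_cost_usd(total_kwh, usd_per_kwh, disk_share):
    """Yearly electricity cost attributable to disk systems, in USD."""
    return total_kwh * usd_per_kwh * disk_share


cost = disk_energy_cost_usd(DATA_CENTER_KWH_PER_YEAR, PRICE_USD_PER_KWH, DISK_SHARE)
print(f"{cost / 1e9:.2f} billion USD per year")
```

The result is about 2.81 billion USD, which the slide rounds to 2.8 billion.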
7. Existing Energy Conservation Techniques
Software-Directed Power Management
Dynamic Power Management
Redundancy Technique
Multi-speed Setting
How Reliable Are They?
8. Contradiction Between Energy Efficiency and Reliability
Energy
Efficiency
Reliability
Example: Disk Spin Up and Down
9. Presentation Outline
• Motivation
• MINT Model
• MREED Model
• Models Validation
• Reliability Improvement
• Conclusion and Future Work
10. MINT
(MATHEMATICAL RELIABILITY MODELS FOR ENERGY-EFFICIENT PARALLEL DISK SYSTEMS)
Energy Conservation
Techniques
Single Disk Reliability Model
System-Level Reliability Model
11. MINT (Single Disk)
[Diagram: Disk Age, Temperature, Frequency, and Utilization are inputs to the Single Disk Reliability Model, which outputs the Reliability of a Single Disk]
12. MINT (Single Disk)
R=α*BaseValue[1]*TemperatureFactor+β*FrequencyAdder[2]
α and β are two coefficients to R
Assumption: α = β = 1 in our research
[1] E. Pinheiro, W.-D. Weber, and L.A. Barroso. Failure trends in a large disk drive population. Proc.
USENIX Conf. File and Storage Tech., February 2007.
[2] IDEMA Standards. Specification of hard disk drive reliability.
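The single-disk formula above can be written as a small function. This is a minimal sketch: the slide gives only the form of the expression and the assumption α = β = 1, so the function name and the numeric inputs below are hypothetical placeholders, not values from the Google report or the IDEMA specification.

```python
def single_disk_r(base_value, temperature_factor, frequency_adder,
                  alpha=1.0, beta=1.0):
    """R = alpha * BaseValue * TemperatureFactor + beta * FrequencyAdder.

    alpha and beta default to 1, matching the slide's stated assumption.
    """
    return alpha * base_value * temperature_factor + beta * frequency_adder


# Hypothetical inputs, chosen only to exercise the formula.
r_example = single_disk_r(base_value=0.05,
                          temperature_factor=1.2,
                          frequency_adder=0.01)
print(r_example)
```

Keeping α and β as parameters makes it easy to explore other weightings of the two terms, although the presentation itself fixes both at 1.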
13. MINT (Single Disk)
R=α*BaseValue*TemperatureFactor+β*FrequencyAdder
[Figures: Utilization Impact on AFR; Temperature Impact on Temperature Factor; Transition Frequency Impact on Frequency Adder]
14. MINT (Single Disk)
R=α*BaseValue*TemperatureFactor+β*FrequencyAdder
Frequency=350/Month, T=40°C
Frequency=250/Month, T=40°C
Frequency=350/Month, T=35°C
Frequency=250/Month, T=35°C
Base Value from Google Report[3]
Single Disk Reliability
[3] E. Pinheiro, W.-D. Weber, and L.A. Barroso. Failure trends in a large disk drive population. Proc.
USENIX Conf. File and Storage Tech., February 2007.
15. MINT (Energy Conservation Techniques- PDC)
Popular Data Concentration (PDC)[3]
System Structure
[Legend: cold data, hot data]
[3] E. Pinheiro and R. Bianchini. Energy conservation techniques for disk array-based servers. Int’l Conf.
on Supercomputing, pages 68–78, June 2004.
16. MINT (Energy Conservation Techniques- PDC)
[Diagram: data migration between a More Popular Disk and a Less Popular Disk. Labels: Access Rate<MIN(Access Rate); Access Rate>MAX(Access Rate). Legend: cold data, hot data]
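The idea of concentrating popular data can be illustrated with a small placement routine. This is only a sketch in the spirit of PDC, not the algorithm of Pinheiro and Bianchini: it assumes a fixed number of files per disk, and the names `pdc_layout`, `rates`, and `files_per_disk` are hypothetical.

```python
def pdc_layout(access_rates, files_per_disk):
    """Pack files onto disks in descending order of access rate.

    access_rates: dict mapping file name -> accesses per month.
    Returns a list of disks; disk 0 is the most popular disk and the
    last disk is the least popular one.
    """
    ranked = sorted(access_rates, key=access_rates.get, reverse=True)
    return [ranked[i:i + files_per_disk]
            for i in range(0, len(ranked), files_per_disk)]


# Hypothetical access rates (accesses per month).
rates = {"a": 900, "b": 15, "c": 400, "d": 2, "e": 60, "f": 700}
layout = pdc_layout(rates, files_per_disk=2)
print(layout)
```

With this layout the hot files end up on the first disk and the cold files on the last, so the least popular disks see little traffic and are the natural candidates for spinning down.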
17. MINT (Energy Conservation Techniques- PDC)
Popular Data Concentration (PDC)[3]
System Structure
[Legend: cold data, hot data]
(Optimal Result for Certain Time Phases)
18. MINT (Energy Conservation Techniques- MAID)
Massive Array of Idle Disks (MAID)[4]
System Structure
[Legend: cold data, hot data]
[4] Dennis Colarelli and Dirk Grunwald. Massive arrays of idle disks for storage archives.
Supercomputing ’02: Proceedings of the 2002 ACM/IEEE conference on Supercomputing, pages 1–11,
Los Alamitos, CA, USA, 2002. IEEE Computer Society Press.
19. MINT (Energy Conservation Techniques- MAID)
Massive Array of Idle Disks (MAID)[4]
System Structure
[Diagram: Cache Disk and Data Disk. Label: Access Rate>MAX(Access Rate). Legend: cold data, hot data]
[4] Dennis Colarelli and Dirk Grunwald. Massive arrays of idle disks for storage archives.
Supercomputing ’02: Proceedings of the 2002 ACM/IEEE conference on Supercomputing, pages 1–11,
Los Alamitos, CA, USA, 2002. IEEE Computer Society Press.
20. MINT (System-Level)
[Diagram components, top to bottom: Access Pattern, Disk Age, Temperature; Energy Conservation Techniques; Frequency and Utilization (per disk); Single Disk Reliability Model; Reliability of Disk 1 ... Reliability of Disk n; System-Level Reliability Model; Reliability of A Parallel Disk System]
21. Preliminary Results (experimental setting)
Energy-efficiency Scheme | File Access Rate (No. per month) | File Size (KB) | Number of Disks
PDC    | 0~10^6 | 300 | 20 data (20 in total)
MAID-1 | 0~10^6 | 300 | 15 data + 5 cache (20 in total)
MAID-2 | 0~10^6 | 300 | 20 data + 5 cache (25 in total)
Read-only Disks
24. MAID under High Access Rate
[Figure: AFR Comparison of PDC and MAID. Access Rate(*10^4) Impacts on AFR (T=35°C). Highlighted curves: MAID-1, MAID-2]
25. MAID under High Access Rate
[Figure: AFR Comparison of PDC and MAID. Access Rate(*10^4) Impacts on AFR (T=35°C). Highlighted curves: MAID-1, MAID-2]
26. MINT (conclusion)
Mathematical Model for Disk Systems
MINT Study on PDC and MAID
But ... What about RAID?
Data Striping Mechanism
Energy Consumption Issues
Reliability Issues
Complexity
28. MREED Model
(MATHEMATICAL RELIABILITY MODELS FOR ENERGY-EFFICIENT RAID SYSTEMS)
Access Pattern Temperature
Energy Conservation Techniques
Utilization
Frequency Weibull Analysis
Annual Failure Rate
29. Weibull Analysis
A Leading Method for Fitting Life Data
Advantages:
Accurate
Small Samples
Widely Used
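For reference, the standard two-parameter Weibull distribution gives a reliability function R(t) = exp(-(t/η)^β) and a hazard rate h(t) = (β/η)(t/η)^(β-1). The sketch below implements those textbook formulas; the shape and scale values are hypothetical and are not the fitted parameters used by MREED, which the slides do not list.

```python
import math


def weibull_reliability(t, shape, scale):
    """Probability that a unit survives beyond time t (t >= 0)."""
    return math.exp(-((t / scale) ** shape))


def weibull_hazard(t, shape, scale):
    """Instantaneous failure rate at time t (assumes t > 0)."""
    return (shape / scale) * (t / scale) ** (shape - 1)


def failure_probability(t_start, t_end, shape, scale):
    """Probability of failing in (t_start, t_end], given survival to t_start."""
    return 1.0 - (weibull_reliability(t_end, shape, scale)
                  / weibull_reliability(t_start, shape, scale))


# Hypothetical parameters; time is measured in power-on hours.
HOURS_PER_YEAR = 8760
first_year = failure_probability(0, HOURS_PER_YEAR, shape=1.2, scale=300_000)
print(f"first-year failure probability: {first_year:.4f}")
```

A shape parameter below 1 models early-life failures, 1 gives a constant failure rate, and a value above 1 models wear-out, which is one reason the distribution fits life data well even with small samples.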
30. MREED Model
(Energy Conservation Techniques- PARAID)
[Diagram: Soft State; RAID; Gears 1, 2, 3]
Power-Aware RAID (PA-RAID)[5]
System Structure
[5] Charles Weddle, Mathew Oldham, Jin Qian, An-I Andy Wang. PARAID: A Gear-Shifting Power-Aware RAID.
USENIX FAST 2007.
31. Reliability Evaluation (Experiment Setup)
Disk Type: Seagate ST3146855FC
Capacity: 146 GB
Cache Size: Sata 16MB
Buffer to Host Transfer Rate: 4Gb/s (Max)
Total Number of Disks: 5
File Size: 100 MB
Number of Files: 1000
Synthetic Trace: Poisson Distribution
Time Period: 24 Hours
Interval Time (Time Phase): 1 Hour
Power-On Hours per Year: 8760 Hours
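A synthetic trace of this kind can be sketched by drawing exponential inter-arrival times, which yields Poisson arrivals. The 24-hour period, 1-hour phases, and 1000 files come from the setup above; choosing the target file uniformly at random and the function names are assumptions made for this illustration.

```python
import random


def poisson_trace(rate_per_hour, hours=24, num_files=1000, seed=0):
    """Return a list of (arrival_time_in_hours, file_id) requests.

    Inter-arrival times are exponential with the given rate, so the
    number of arrivals per hour is Poisson distributed.
    """
    rng = random.Random(seed)
    trace = []
    t = rng.expovariate(rate_per_hour)
    while t < hours:
        # Assumption: each request targets a uniformly chosen file.
        trace.append((t, rng.randrange(num_files)))
        t += rng.expovariate(rate_per_hour)
    return trace


def requests_per_phase(trace, hours=24, phase_hours=1):
    """Count the requests that fall into each time phase."""
    counts = [0] * (hours // phase_hours)
    for t, _ in trace:
        counts[int(t // phase_hours)] += 1
    return counts


low_rate_trace = poisson_trace(rate_per_hour=20)
phase_counts = requests_per_phase(low_rate_trace)
print(len(low_rate_trace), phase_counts[:3])
```

Calling `poisson_trace(rate_per_hour=80)` would give the high access rate case; the fixed seed keeps a run reproducible.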
32. Reliability Evaluation
(Disk Utilization Comparison)
Disk Utilization Comparison Between PARAID-0 and RAID-0 at A Low Access Rate (20/hr)
33. Reliability Evaluation
(Disk Utilization Comparison)
Disk Utilization Comparison Between PARAID-0 and RAID-0 at A High Access Rate (80/hr)
34. Reliability Evaluation (AFR Comparison)
AFR Comparison Between PARAID-0 and RAID-0 at A Low Access Rate (20/hr)
35. Reliability Evaluation (AFR Comparison)
AFR Comparison Between PARAID-0 and RAID-0 at A High Access Rate (80/hr)
37. Model Validation
Techniques
– Run the Systems for A Couple of Decades
– The Event Validity Validation Techniques[6]
[6] R.G. Sargent, “Verification and Validation of Simulation Models”, in Proceedings of the 37th conference on
Winter Simulation, ser. WSC’05 Winter Simulation Conference, 2005.
38. Model Validation
Challenges
Unable to Monitor PARAID Running for Years
Sample Size is Small from A Validation
Perspective (e.g. 100 Disks for Five Years)
39. Model Validation (DiskSim[7] Simulation)
File To Block Level Converter
[7] S.W.S John, S. Bucy, Jiri Schindler and G.R. Ganger, “The DiskSim Simulation Environment Version 4.0
Reference Manual”, 2008
40. Model Validation (DiskSim Simulation)
Diagram of the Storage System Corresponding to the DiskSim RAID-0
41. Model Validation (Result)
Utilization Comparison Between MREED and DiskSim Simulator
42. Model Validation (Result)
Gear Shifting Comparison Between MREED and DiskSim Simulator
44. Recall PDC
Popular Data Concentration (PDC)
System Structure
[Legend: cold data, hot data]
(Optimal Result for Certain Time Phases)
45. Problem of PDC
The Most Popular Disk:
High AFR
No Replica
46. Reliability Improvement of PDC
Method of Improving Reliability
Mirroring
Extra Disks for Replication -> More Energy Consumption
Disk Swapping
Swap Existing Disks
47. Disk Swapping Scheme
PDC
Swap the Most Popular Disk with the Least Popular Disk
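A single swap step can be sketched as follows. The slide states only that the most popular disk is swapped with the least popular one; triggering the swap when the hottest disk's monthly access rate reaches a threshold is an assumption based on the swapping thresholds listed in the experimental setting, and the dictionary layout and function name are hypothetical.

```python
def swap_if_needed(disks, threshold):
    """Swap the data of the most and least popular disks when warranted.

    disks: list of dicts with 'id', 'data', and 'access_rate'
    (accesses per month). The access rate follows the data, so after a
    swap the previously hot physical disk serves the cold data.
    Returns True if a swap took place.
    """
    hottest = max(disks, key=lambda d: d["access_rate"])
    coldest = min(disks, key=lambda d: d["access_rate"])
    if hottest is coldest or hottest["access_rate"] < threshold:
        return False
    hottest["data"], coldest["data"] = coldest["data"], hottest["data"]
    hottest["access_rate"], coldest["access_rate"] = (
        coldest["access_rate"], hottest["access_rate"])
    return True


# Hypothetical three-disk system.
disks = [
    {"id": 0, "data": "hot", "access_rate": 300_000},
    {"id": 1, "data": "warm", "access_rate": 50_000},
    {"id": 2, "data": "cold", "access_rate": 1_000},
]
swapped = swap_if_needed(disks, threshold=200_000)
print(swapped, disks[0]["data"], disks[2]["data"])
```

The sketch does nothing to stop a later call from swapping the same pair back, so it is meant to be invoked once, as in the single-swapping case; a multiple-swapping scheme would need extra bookkeeping.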
50. Preliminary Results (experimental setting)
Energy-efficiency Scheme | File Access Rate (No. per month) | File Size (KB) | Number of Disks
PDC    | 0~10^6 | 300 | 20 data (20 in total)
MAID-1 | 0~10^6 | 300 | 15 data + 5 cache (20 in total)
MAID-2 | 0~10^6 | 300 | 20 data + 5 cache (25 in total)
• Read-only Disks
• Mean Time to Data Loss (MTTDL)
• Swapping Thresholds (2*10^5, 5*10^5, 8*10^5 No./Month)
• Single Swapping
51. Comparison of Disk Swap
PDC
AFR Comparison of PDC
Access Rate(*10^4) Impacts on AFR (T=35°C)
Threshold = 2*10^5 No./Month
52. Comparison of Disk Swap
PDC
AFR: Swap2 < Swap1 < No Swap
AFR Comparison of PDC
Access Rate(*10^4) Impacts on AFR (T=35°C)
Threshold = 2*10^5 No./Month
53. Comparison Between Different Threshold
PDC
AFR Comparison of PDC
Access Rate(*10^4) Impacts on AFR (T=35°C)
Threshold = 2*10^5 No./Month
54. Comparison Between Different Threshold
PDC
AFR Comparison of PDC
Access Rate(*10^4) Impacts on AFR (T=35°C)
Threshold = 5*10^5 No./Month
55. Comparison Between Different Threshold
PDC
AFR Comparison of PDC
Access Rate(*10^4) Impacts on AFR (T=35°C)
Threshold = 8*10^5 No./Month
56. Comparison Between Different Threshold
PDC
AFR: Higher Threshold -> Lower AFR
AFR Comparison of PDC
Access Rate(*10^4) Impacts on AFR (T=35°C)
Threshold = 2*10^5 No./Month, 5*10^5 No./Month, 8*10^5 No./Month
57. Limitations
• Read Only Disk Scenario
• Data Migration within Certain Time Phases
• Simple File Access Patterns
58. Future Work
Extend the models to investigate mixed read/write workloads;
Research the trade-offs between reliability and energy efficiency;
Extend the schemes to a real-world environment;
Develop a multi-swapping mechanism that balances utilization and lowers the failure rate;
Evaluate more control groups.
59. Conclusion
• Generic models coupled with power management optimization policies;
• Two reliability models for the three well-known energy-saving schemes: PDC, MAID, and PARAID;
• Disk swapping strategies to improve disk reliability for PDC.
Hot data: data with an increasing access rate.
Cold data: data that has not been accessed for a while and has a decreasing access rate.
Most popular disk: stores most of the hot data.
Least popular disk: stores most of the cold data.
Cache Disks only handle the data copied in while Data Disks only handle the data copied out
Many of the applications are read-only, e.g., server systems like youtube.com, which see more downloading than uploading.
Utilization sensitivity: PDC is higher than MAID. AFR: when the access rate is low, PDC is lower than MAID; the reason is on the following slide.
from access rate-utilization model, the higher the access rate is, the higher the utilization of the system will be, however from the utilization-AFR model,
The AFR of MAID will NOT keep decreasing. At a high access rate, the AFR of MAID-1 is higher than that of MAID-2; the reason is on the following slide.
Workload: (1) file accesses; (2) file movement; (3) parity data (for level 5).
Problems that need to be solved
Why use a benchmark to validate: it is impractical to run the systems in real life, and the sample size is still small. Why validate only the access rate-utilization model: the utilization-AFR model is based on maintenance data from the report.
Why model the reliability of PARAID? In a RAID with an energy-saving scheme, a disk failure causes more problems: total data loss (RAID 0) or more time and energy spent on data recovery (RAID 5).
Why not improve the reliability of PARAID? The reliability of RAID with energy-saving schemes is still under research. Level 0: hardly any room for improvement. Level 1: need to find the balance point between energy and reliability. Level 5: complexity, performance, energy, and reliability.