Some slides on the original design of RAID, a Redundant Array of Inexpensive Disks. Demonstrates the tradeoffs between the varying RAID levels and gives some historical context.
Overview of Redundant Disk Arrays
1. Andrew Robinson
University of Michigan
<androbin@umich.edu>
Redundant Arrays of
Inexpensive Disks (RAID)
What a cool idea!
2. Authors
• David A Patterson
• Garth Gibson
• Randy H Katz
Officially published in 1988.
3. Overview
• What is RAID?
• Why bother?
• What is RAID, really?
• How well does it work?
• How’s it holding up?
4. What is RAID?
• Take a bunch of disks and make them appear
as one disk.
• Put data on all of them
• Use all at once to gain performance
• Duplicate data to gain reliability
• Buy cheap disks to gain dollars
7. CPUs and Memory kept getting faster…
• Exponential growth everywhere!
• CPU Performance: 1.4X increase per year
– More transistors
– Better architecture
• Memory Performance: 1.4-2X increase per
year
– Invention of caches
– SRAM technology
8. … but disks did not.
• It’s hard to make things spin exponentially
faster every year (they tend to fly apart).
• Disk seek time improved at a rate of
approximately 7% a year.
• Caching had been employed to buffer I/O
activity; this works reasonably well for
predictable workloads.
9. Slow I/O Makes Slow Computers
• Amdahl’s Law describes the impact of improving
only some pieces while leaving the others unchanged.
$$S = \frac{1}{(1 - f) + f / k}$$
S – The effective speedup
f – Fraction of work in faster mode
k – Speedup while in faster mode
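A minimal sketch of the formula above, for checking numbers by hand. The function name and the example workload (10% of time in I/O, so f = 0.9) are illustrative assumptions, not part of the original slides.

```python
def amdahl_speedup(f: float, k: float) -> float:
    """Effective speedup S when a fraction f of the work runs k times faster."""
    if not 0.0 <= f <= 1.0:
        raise ValueError("f must be between 0 and 1")
    if k <= 0:
        raise ValueError("k must be positive")
    return 1.0 / ((1.0 - f) + f / k)


# Assumed workload: 10% of the time is I/O (not sped up), 90% is CPU work.
for k in (10, 100):
    print(f"CPU {k}x faster -> overall {amdahl_speedup(0.9, k):.2f}x")
# CPU 10x faster -> overall 5.26x
# CPU 100x faster -> overall 9.17x
```

Even a 100x faster CPU leaves this workload less than 10x faster overall, because the I/O fraction dominates.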
10. …really slow.
• If applications spend 10% of their time in I/O,
when computers are 10 times faster, they will
only appear about 5 times faster.
Something needed to be done.
11. What should we do?
• Single Large Expensive Disks (SLED) are not
improving fast enough.
• Larger memory or solid state drives weren’t
practical
• Small personal hard drives are emerging… can
we do something with those?
14. Why didn’t someone do this before?
• Standards like SCSI have finally allowed drive
makers to integrate features seen in
traditional mainframe controllers.
15. There is a problem…
• A hundredfold increase in the number of disks
means a hundredfold decrease in
total reliability
$$\mathrm{MTTF}_{\mathrm{DiskArray}} = \frac{\mathrm{MTTF}_{\mathrm{SingleDisk}}}{n_{\mathrm{Disks}}}$$
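A rough sketch of the relationship above. It assumes independent disk failures at a constant rate, and the 30,000-hour single-disk MTTF is an illustrative figure chosen for the example, not a number taken from these slides.

```python
def array_mttf_hours(single_disk_mttf_hours: float, n_disks: int) -> float:
    """MTTF of an array with no redundancy: any single disk failure loses data."""
    if n_disks < 1:
        raise ValueError("n_disks must be at least 1")
    return single_disk_mttf_hours / n_disks


SINGLE_DISK_MTTF_HOURS = 30_000  # assumed value, for illustration only

for n in (1, 10, 100):
    hours = array_mttf_hours(SINGLE_DISK_MTTF_HOURS, n)
    print(f"{n:>3} disks: {hours:>6.0f} hours (~{hours / 24:.1f} days)")
```

Under these assumptions, a 100-disk array without redundancy would fail roughly every 300 hours, which is why the check disks in the following slides are needed.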
17. A couple levels… a single idea
• RAID manages the tradeoff between
performance and reliability
• RAID comes in levels (RAID1 to RAID5)
• These levels represent points in the
performance-reliability space
18. Groups, Disks, and Check Disks
• RAID organizes disks into reliability groups
• Some of the disks in a group store error
correcting data
D = Total disks with data
G = Data disks in a group (not counting check disks)
C = Number of check disks in a group
19. Metrics
• Useable Storage – Percent of storage that
holds data, excluding parity information
• Performance – Tough to make one number:
– Reads, Writes, and Read-Modify-Write Access
Patterns
– Sequential and Random Data Distribution
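The usable-storage metric can be sketched directly from G and C. This assumes G counts only the data disks in a group (check disks excluded), which is the reading that reproduces the percentages quoted for each level in this deck; the function name is hypothetical.

```python
def usable_storage_percent(g: int, c: int) -> float:
    """Percent of a group's capacity that holds data: G / (G + C)."""
    if g < 1 or c < 0:
        raise ValueError("need g >= 1 and c >= 0")
    return 100.0 * g / (g + c)


# (G, C) pairs: mirroring, Hamming-coded groups, single-parity groups.
configs = {
    "mirroring, G=1, C=1": (1, 1),
    "Hamming, G=10, C=4": (10, 4),
    "Hamming, G=25, C=5": (25, 5),
    "parity, G=10, C=1": (10, 1),
    "parity, G=25, C=1": (25, 1),
}
for name, (g, c) in configs.items():
    print(f"{name}: {usable_storage_percent(g, c):.0f}%")
# mirroring, G=1, C=1: 50%
# Hamming, G=10, C=4: 71%
# Hamming, G=25, C=5: 83%
# parity, G=10, C=1: 91%
# parity, G=25, C=1: 96%
```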
20. RAID1 – The Naive Approach
• Mirroring of all data
• To read:
– Use either disk
• To write:
– Send to both disks
simultaneously
• Minor read
performance increase.
21. Evaluation
Pros
• Reads can occur simultaneously
• Seek times can improve with special controllers
• Predictable performance
Cons
• Useable storage is cut in half
• All other performance metrics are left the same
Alright for large sequential jobs and transaction
processing jobs
22. RAID2 – Bit Level Striping
• Uses Hamming Code for Error Detection
• Requires many check disks
– For 10 data disks, 4 check disks
– For 25 data disks, 5 check disks
• Can detect errors, and determine the at-fault
disk
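The slides do not say how the check-disk counts were derived. One plausible reading is the standard bound for a single-error-correcting Hamming code: the smallest c with 2^c >= d + c + 1. The sketch below, with a hypothetical function name, reproduces both counts quoted above.

```python
def hamming_check_disks(data_disks: int) -> int:
    """Smallest c such that 2**c >= data_disks + c + 1."""
    if data_disks < 1:
        raise ValueError("data_disks must be at least 1")
    c = 0
    while 2 ** c < data_disks + c + 1:
        c += 1
    return c


for d in (10, 25):
    print(f"{d} data disks -> {hamming_check_disks(d)} check disks")
# 10 data disks -> 4 check disks
# 25 data disks -> 5 check disks
```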
24. Evaluation
Pros
• Better useable storage, 71% for G=10, 83% for G=25
Cons
• Dismal small random data access performance: 3-9% of RAID1 or SLED
Good for large sequential jobs, bad for transaction
processing systems.
25. RAID3 – Byte Level Striping
• Simpler parity error correction
• Only a single check disk required for error
detection
• Cannot determine which disk failed, but that’s
usually pretty obvious
• Transfers of large contiguous blocks perform well
27. Evaluation
Pros
• Even better useable storage, 91% for G=10, 96% for G=25
Cons
• Small random data access performance: Just as bad as RAID2
Even better for large sequential jobs, bad for
transaction processing systems.
28. What is parity?
• Parity is calculated as an XOR of the data
blocks.
• XOR is reversible:
– 1011 (A1) XOR 1100 (A2) => 0111 (AP) “parity”
– 0111 (AP) XOR 1011 (A1) => 1100 (A2)
– 0111 (AP) XOR 1100 (A2) => 1011 (A1)
• This makes error detection and reconstruction
possible!
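The same arithmetic as a small sketch, using the A1/A2 values from this slide. The helper names are illustrative; blocks are modeled as plain integers rather than real disk sectors.

```python
from functools import reduce
from operator import xor


def parity(blocks):
    """XOR all data blocks together to get the parity block."""
    return reduce(xor, blocks, 0)


def reconstruct(surviving_blocks, parity_block):
    """Recover the one missing block from the survivors plus parity."""
    return reduce(xor, surviving_blocks, parity_block)


a1, a2 = 0b1011, 0b1100
ap = parity([a1, a2])
print(f"parity:        {ap:04b}")                     # 0111
print(f"recovered A2:  {reconstruct([a1], ap):04b}")  # 1100
print(f"recovered A1:  {reconstruct([a2], ap):04b}")  # 1011
```

The same functions work unchanged for more than two data blocks, as long as only one block is missing and it is known which one.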
29. RAID4 - Block Level Striping
• Like RAID3, but with more parallelism
• Interleave data at sector level rather than bit
level
• Allows for servicing of multiple block requests
by different drives
• Still keeps all the parity information on a
single drive
31. Evaluation
Pros
• Finally better small random access. Reads are fast!
Cons
• Small writes and read-modify-writes are still slow.
Good for large sequential jobs, still not great for
transaction processing systems.
32. RAID5 – Block Level Striping with
Distributed Parity
• Instead of checksums on a single disk, we
distribute them across all disks.
• Allows us to support multiple writes per group
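A sketch of why distributing parity allows multiple writes per group. The rotation rule used here (parity for stripe s lives on disk s mod n) is an assumption made for illustration; the slides do not specify a layout, and real implementations use several different ones.

```python
def parity_disk(stripe: int, n_disks: int) -> int:
    """Hypothetical rotation: parity for a stripe sits on disk (stripe mod n)."""
    return stripe % n_disks


def disks_touched_by_small_write(stripe: int, data_slot: int, n_disks: int):
    """Return (data_disk, parity_disk) for a write to one data block of a stripe."""
    p = parity_disk(stripe, n_disks)
    data_disks = [d for d in range(n_disks) if d != p]
    return data_disks[data_slot], p


n = 5
write_a = disks_touched_by_small_write(stripe=0, data_slot=0, n_disks=n)
write_b = disks_touched_by_small_write(stripe=2, data_slot=2, n_disks=n)
print(write_a, write_b)                          # (1, 0) (3, 2)
print(set(write_a).isdisjoint(write_b))          # True: both can proceed in parallel
```

With all parity on a single check disk, both writes would need that one disk and would have to be serialized.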
34. Evaluation
Pros
• Really good usable storage
• Finally decent small random data access performance across the board!
Cons
• Slightly worse write performance, data must be written to two disks simultaneously
Finally, a system that works well for both applications!
36. As a Whole
• RAID has many different levels that achieve
different tradeoffs in reliability and
performance
• Almost all of them will, for some (or many) use
cases, outperform a SLED for the same
cost.
40. RAID has held up remarkably well
• Data centers around the world use RAID
technology.
• The small, inexpensive disk is the de facto
standard of storage
• The ideas developed for RAID have been
applied to many not-RAID things
41. Some open questions
• What will become of RAID as new, super fast
storage media start to become cost
effective?
• How does it fit in with massive internet-scale
storage farms?
42. Take Aways
• RAID offers significant advantage over SLED for
the same cost
– RAID5 offers 10x improvement in performance,
reliability, and power consumption while reducing size
of array.
• RAID allows for modular growth (add more disks)
• Cost effective option to meet challenge of
exponential growth in processor and memory
speeds
43. References
• “A Case for Redundant Arrays of Inexpensive
Disks” by David A Patterson, Garth Gibson,
and Randy H Katz
• “RAID: A Personal Recollection of How Storage
Became a System” by Randy H Katz
• Slides by David Luo and Ramasubramanian K.
• Images generously borrowed from Wikipedia
<http://en.wikipedia.org/wiki/RAID>
----- Meeting Notes (1/21/12 13:53) -----
Invented around 1987.
Patterson - Berkeley
Gibson - Currently at CMU
Katz - Berkeley
Exploits clever XOR trick to not require reading data off of all the disks to recalculate parity.
Each small write requires 2 disks and 4 accesses, 2 reads and 2 writes.
Each small read requires only 1 access.
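The XOR trick in the notes above can be sketched as follows: the new parity is the old parity XOR the old data XOR the new data, so only the data disk and the parity disk are touched. The toy array (lists of integers standing in for disks) and the function name are assumptions for illustration.

```python
from functools import reduce
from operator import xor


def small_write(disks, data_disk, parity_disk, block, new_data):
    """Update one data block and its parity; return the number of disk accesses."""
    old_data = disks[data_disk][block]        # access 1: read old data
    old_parity = disks[parity_disk][block]    # access 2: read old parity
    new_parity = old_parity ^ old_data ^ new_data
    disks[data_disk][block] = new_data        # access 3: write new data
    disks[parity_disk][block] = new_parity    # access 4: write new parity
    return 4


# Toy array: three data disks plus one parity disk, one block each.
data = [0b1011, 0b1100, 0b0110]
disks = [[d] for d in data] + [[reduce(xor, data)]]

accesses = small_write(disks, data_disk=1, parity_disk=3, block=0, new_data=0b0000)
full_recompute = reduce(xor, (disks[i][0] for i in range(3)))
print(accesses, disks[3][0] == full_recompute)  # 4 True
```

The updated parity matches a full recomputation over all data disks, even though the other data disks were never read.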