3. INTRODUCTION (1/2)
The global appetite for data storage is growing at an astonishing
rate, which is why traditional storage devices are no longer adequate
in terms of cost, capacity and efficiency.
How to cope with the explosive data growth?
How to store and preserve data while optimizing storage resources?
How to ensure backups?
4. INTRODUCTION (2/2)
What is a logical drive?
A logical drive:
•is a group of physical disks that appears to your OS as a single drive.
•can comprise one or more disk drives and can use part or all of each
disk drive’s capacity.
•can share a disk drive with another logical drive: the same disk drive
can belong to two different logical drives, each using just a portion of
the space on that drive.
The combination of multiple disk drive components into a logical unit
is achieved by means of a storage technology: RAID
6. RAID
In 1987, Patterson, Gibson and Katz at the University of California,
Berkeley, published a paper entitled “A Case for Redundant Arrays of
Inexpensive Disks (RAID)”.
The basic idea of RAID was to combine multiple small, inexpensive
disk drives into an array that yields performance exceeding that of
a Single Large Expensive Disk (SLED).
7. ADVANTAGES OF RAID
Using a RAID storage subsystem has the following advantages:
•Redundancy:
• Provides fault-tolerance by mirroring or parity operation.
• Provides disk spanning by weaving all connected drives into one single volume.
•Increased Performance: Increases disk access speed by breaking data
into several blocks that are read from or written to several drives in parallel.
With striping, throughput generally increases as more drives are added.
•Lower Costs:
• Acquisition costs
• Management costs: floor space, operational risk, power and cooling.
8. A FEW TERMS TO UNDERSTAND
•Data Striping: Data is split across multiple drives in a RAID array to
form a single logical storage unit. Each drive's storage space is
partitioned into stripes, ranging in size from one sector (512 bytes)
to multiple megabytes. The stripes then are interleaved so that the
logical storage unit is made up of alternating stripes from each drive.
•Mirroring: Used in RAID levels 1 and 1+0 for data recovery. Data is
duplicated across two disks. If one drive fails, the data remains
available on the other disk. It is somewhat like low-end clustering.
•Parity: Information used in RAID levels 3, 4 and 5 for data recovery. In
the event of a drive failure, the parity information can be combined with
the remaining data to regenerate the missing information.
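The parity mechanism described above is commonly implemented as a byte-wise XOR across the data blocks of a stripe. The following is a minimal sketch under that assumption; the function names (`xor_blocks`, `compute_parity`, `rebuild_missing`) are hypothetical and do not come from any particular RAID controller.

```python
from functools import reduce


def xor_blocks(blocks):
    """Return the byte-wise XOR of a list of equally sized byte blocks."""
    if not blocks:
        raise ValueError("need at least one block")
    size = len(blocks[0])
    if any(len(block) != size for block in blocks):
        raise ValueError("all blocks must have the same size")
    # zip(*blocks) yields one tuple of byte values per byte position.
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def compute_parity(data_blocks):
    """Parity block of one stripe: XOR of all its data blocks."""
    return xor_blocks(data_blocks)


def rebuild_missing(surviving_blocks, parity):
    """Regenerate the single missing data block from the survivors and the parity."""
    return xor_blocks(list(surviving_blocks) + [parity])


# One stripe spread over three data drives (illustrative contents).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = compute_parity(data)

# Simulate losing the second drive and rebuilding its block.
rebuilt = rebuild_missing([data[0], data[2]], parity)
```

XOR works here because applying it twice with the same value cancels out, so combining the parity with every surviving block leaves exactly the missing block. A single parity block can recover only one lost block per stripe.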
9. RAID LEVELS
•Data are distributed across the array of disk drives.
•Redundant disk capacity is used to store parity information, which
guarantees data recoverability in case of a disk failure.
•Levels are defined by the scheme each one uses to provide redundancy
at lower cost, based on striping and “parity” bits.
•Different cost-performance trade-offs.
10. RAID 0: DISK STRIPING
•Minimum Disks Required=2
•Capacity=N
•Redundancy=No
RAID 0 provides the highest performance but no redundancy.
Data in the logical drive is striped (distributed) across several physical drives.
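To make the striping idea concrete, here is a minimal sketch of how a logical block number could be mapped to a physical drive and an offset on that drive. It assumes a simple round-robin layout; `locate_block` and its parameters are hypothetical names, and real controllers may use different layouts and stripe sizes.

```python
def locate_block(logical_block, num_drives, stripe_blocks=1):
    """Map a logical block number to (drive index, block offset on that drive).

    Assumes round-robin striping: stripe 0 goes to drive 0, stripe 1 to
    drive 1, and so on, wrapping around after the last drive.
    """
    if num_drives < 2:
        raise ValueError("RAID 0 needs at least two drives")
    if logical_block < 0 or stripe_blocks < 1:
        raise ValueError("invalid block number or stripe size")
    stripe_index, offset_in_stripe = divmod(logical_block, stripe_blocks)
    drive = stripe_index % num_drives   # which drive holds this stripe
    row = stripe_index // num_drives    # how many stripes precede it on that drive
    return drive, row * stripe_blocks + offset_in_stripe


# Layout of the first six logical blocks on a three-drive array.
layout = [locate_block(block, num_drives=3) for block in range(6)]
```

Consecutive logical blocks land on different drives, which is why large sequential transfers can be serviced by several drives in parallel.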
11. RAID 1: DISK MIRRORING
•Minimum Disks Required=2
•Capacity=N/2
•Redundancy=Yes
RAID 1 mirrors the data stored in one hard drive to another.
In the configuration described here, RAID 1 is performed with exactly two hard drives.
If there are more than two hard drives, RAID (0+1) is performed
automatically.
Simultaneous reads from both drives
High reliability with fast recovery
12. RAID 0+1: DISK STRIPING WITH MIRRORING
•Minimum Disks Required=4
•Capacity=N/2
•Redundancy=Yes
RAID (0+1) combines RAID 0 and RAID 1: mirroring and striping.
RAID (0+1) can survive multiple drive failures because of the full
redundancy of the hard drives, provided at least one copy of every
block survives. If more than two hard drives are assigned to perform
RAID 1, RAID (0+1) is performed automatically.
13. RAID 3: DISK STRIPING WITH DEDICATED PARITY DISK
•Minimum Disks Required=3
•Capacity=N-1
•Redundancy=Yes
RAID 3 performs Block Striping with Dedicated Parity.
One drive member is dedicated to storing the parity data.
When a drive member fails, the controller can regenerate the lost data
of the failed drive from the dedicated parity drive and the surviving
data drives.
RAID 3 is usually used for storing large files, such as multimedia, music,
videos, and photos. Performance for writing or reading files larger than
1 MB is dramatically better than in a single-drive, non-RAID system.
14. RAID 5: STRIPING WITH INTERSPERSED PARITY
•Minimum Disks Required=3
•Capacity=N-1
•Redundancy=Yes
RAID 5 is similar to RAID 3, but the parity data is not stored on one
dedicated hard drive. Parity information is interspersed across the
drive array.
In the event of a failure, the controller can regenerate the lost data
of the failed drive from the surviving drives.
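The difference from a dedicated parity drive can be sketched by rotating the parity position from one stripe row to the next. The rotation rule below is only one possible scheme, chosen for illustration; actual controllers may rotate parity differently, and the names `parity_drive` and `stripe_layout` are hypothetical.

```python
def parity_drive(stripe_row, num_drives):
    """Index of the drive holding parity for a given stripe row.

    Assumed rule: parity starts on the last drive and moves one drive
    to the left on each successive row, wrapping around.
    """
    if num_drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    return (num_drives - 1 - stripe_row) % num_drives


def stripe_layout(stripe_row, num_drives):
    """Describe one stripe row: 'P' marks parity, 'D0', 'D1', ... mark data."""
    parity_index = parity_drive(stripe_row, num_drives)
    layout = []
    data_index = 0
    for drive in range(num_drives):
        if drive == parity_index:
            layout.append("P")
        else:
            layout.append(f"D{data_index}")
            data_index += 1
    return layout


# First three stripe rows of a three-drive array.
rows = [stripe_layout(row, 3) for row in range(3)]
```

Because every drive holds parity for some rows, parity updates are spread over all members instead of concentrating on a single drive.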
RAID 5 is a good choice for multimedia file storage. Its read speed
can be very high, while the write speed is slightly slower, due to the
need to calculate and distribute the parity.
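The capacity figures quoted on the RAID level slides (N, N/2, N-1) and the minimum disk counts can be collected in a small calculator. This is a sketch that assumes all drives have the same size; the function name `usable_capacity` and the level labels are hypothetical conventions chosen for this example.

```python
# Minimum number of drives per RAID level, as listed on the slides.
MIN_DRIVES = {"0": 2, "1": 2, "0+1": 4, "3": 3, "5": 3}


def usable_capacity(level, num_drives, drive_size_gb):
    """Usable capacity in GB for an array of equally sized drives."""
    if level not in MIN_DRIVES:
        raise ValueError(f"unknown RAID level: {level}")
    if num_drives < MIN_DRIVES[level]:
        raise ValueError(f"RAID {level} needs at least {MIN_DRIVES[level]} drives")
    if level == "0":
        data_drives = num_drives          # Capacity = N
    elif level in ("1", "0+1"):
        if num_drives % 2:
            raise ValueError("mirrored levels need an even number of drives")
        data_drives = num_drives // 2     # Capacity = N/2
    else:
        data_drives = num_drives - 1      # Capacity = N-1 (RAID 3 and RAID 5)
    return data_drives * drive_size_gb


# Four hypothetical 1000 GB drives under each supported level.
summary = {level: usable_capacity(level, 4, 1000) for level in ("0", "0+1", "3", "5")}
```

With four drives, striping alone keeps all the raw capacity, mirroring halves it, and the parity levels give up the equivalent of one drive.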
15. A FEW TERMS TO UNDERSTAND
•Snapshots: come in three basic flavors: file-system based, subsystem
based and volume-manager/virtualization based. All three are considerably
different. Snapshots are an extremely important function for business
continuity, but there are many details to work through. You need a strategy
for snapshots as well as a decent understanding of how you will establish
operations to work with them. They will change your daily operations and
they require constant, ongoing administration. Platform-specific operations
for flushing cache (file system buffers) matter a great deal.
•Replication: the transport of data objects (files, tables) over a TCP/IP
network. The transfer is made from system to system, not between storage
devices or subsystems.
17. DAS
•DAS (Direct Attached Storage) is an architecture in which the storage
is “privately” attached to the servers: it cannot be shared, is hard to
scale, and is expensive and complex to manage. Still, 80% of the market
is DAS.
•For an individual computer user, the hard drive is the usual form of
direct-attached storage. In an enterprise, providing for storage that
can be shared by multiple computers and their users tends to be
more efficient and easier to manage.
19. NAS 1/2
•NAS (Network Attached Storage) is file-level computer data storage
connected to an IP network, providing data access to a heterogeneous
group of clients.
•NAS removes the responsibility of file serving from other servers on
the network. NAS units typically provide access to files using network
file sharing protocols such as NFS (popular on UNIX systems), SMB/CIFS
(Server Message Block/Common Internet File System, used with MS
Windows systems), AFP (used with Apple Macintosh computers), or
NCP (used with OES and Novell NetWare).
20. NAS 2/2
•Easy appliance
•Clustered file-system
•NAS units rarely limit clients to a single protocol
•Not recommended for applications requiring high disk performance
•Heavy usage of CPU
22. SAN 1/2
•A Storage Area Network (SAN) is an independent network for storage
subsystems, free from the rest of the computer network.
•SAN is a dedicated network that provides access to consolidated,
block level data storage.
•It's a networked architecture that provides I/O connectivity between
hosts and storage devices.
•SAN devices (hubs, switches, servers and storage devices) together
implement a storage resource environment.
23. SAN 2/2
The storage network can be:
•A Fibre Channel network
• Uses a network of Fibre Channel connectivity devices: FC Switches and
Directors
• For transport, an FC SAN uses FCP
• FCP is serial SCSI-3 over Fibre Channel
•Or an IP network
• Uses standard LAN infrastructure: Ethernet switches
• For transport, an IP SAN uses iSCSI
• iSCSI is SCSI-3 over TCP/IP
24. STORAGE ARRAY
•The concept of a logical volume is very similar to that of a logical
drive. A logical volume is composed of one or several logical drives;
the member logical drives can be of the same RAID level or of different
RAID levels.
•The logical volume can be divided into a maximum of 8 partitions.
During operation, the host sees a non-partitioned logical volume or a
partition of a partitioned logical volume as one single physical drive.
•A Volume Group (VG) is the highest level abstraction used within the
Logical Volume Manager. It gathers together a collection of Logical
Volumes (LV) and Physical Volumes (PV) into one administrative unit.
25. FC: FIBRE CHANNEL
•Fibre Channel is a high-speed network technology primarily used for storage
networking.
•Despite its name, Fibre Channel signaling can run on twisted pair copper
wire in addition to fiber-optic cables.
•It has now become the standard connection type for storage area networks
(SAN) in enterprise storage.
•Fibre Channel Protocol (FCP) is a transport protocol (similar to TCP used in
IP networks) that predominantly transports SCSI commands over Fibre
Channel networks.
26. FIBRE CHANNEL PROTOCOL LAYERS
FC-0 (Physical Layer)
FC-1 (Encode/Decode)
FC-2 (Framing Protocol/Flow Control)
FC-3 (Common Services)
FC-4 (an interface with one ULP)
ULPs shown: IP, SCSI, Audio/Video
Figure labels: FC-UL, FC-PH
28. POINT TO POINT
•This is the simplest FC SAN topology: the host and the storage
connect directly.
Advantage: high transmission speed
Limitation: little room for system expansion
[Diagram: Host connected directly to Storage]
29. ARBITRATED LOOP
•Nodes are connected in a one-way loop, and data is passed from node
to node around it.
•It is designed to scale to a limited number of nodes (up to 127).
Low cost (no interconnecting devices needed)
Limited performance (arbitration overhead and shared bandwidth)
[Diagram: one host and two storage devices connected in a loop]
30. SWITCHED FABRIC
•Switched fabric is a network topology in which hosts and storage
devices connect with each other by means of switches.
Bi-directional connection
High performance (each logical connection receives dedicated
bandwidth)
Scalable, robust and reliable architecture
[Diagram: two hosts and two storage devices interconnected through two switches]
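The performance contrast between the arbitrated loop (shared bandwidth) and the switched fabric (dedicated bandwidth per connection) can be illustrated with idealized arithmetic. This sketch ignores arbitration overhead and switch contention, assumes the loop bandwidth is shared evenly among active nodes, and uses a hypothetical link rate of 400 MB/s; the function names are illustrative only.

```python
MAX_LOOP_NODES = 127  # node limit quoted on the arbitrated loop slide


def loop_share(link_rate_mb_s, active_nodes):
    """Idealized per-node bandwidth on an arbitrated loop (evenly shared)."""
    if not 1 <= active_nodes <= MAX_LOOP_NODES:
        raise ValueError("an arbitrated loop supports 1 to 127 nodes")
    return link_rate_mb_s / active_nodes


def fabric_share(link_rate_mb_s, active_nodes):
    """Idealized per-connection bandwidth on a switched fabric (dedicated)."""
    if active_nodes < 1:
        raise ValueError("need at least one node")
    return float(link_rate_mb_s)


# Hypothetical 400 MB/s links with 8 active nodes.
loop_rate = loop_share(400, 8)
fabric_rate = fabric_share(400, 8)
```

Under these simplifying assumptions the loop's per-node share shrinks as nodes are added, while each fabric connection keeps the full link rate.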
31. IP SAN
•IP SAN is a storage area network that transmits data over TCP/IP.
Because the storage commands are encapsulated in IP packets that carry
the destination address, IP SAN is an efficient, point-to-point
storage solution. There are several ways to implement a SAN over
TCP/IP, such as FCIP (Fibre Channel over IP), iFCP (Internet Fibre
Channel Protocol) and iSCSI (Internet SCSI), which is more
cost-efficient than a Fibre Channel SAN.
Low acquisition costs
Commodity economics
Uses a proven technology installed at almost every business site
Wide area connectivity: no interconnect distance limit
32. ISCSI
•iSCSI is an Internet protocol standard officially ratified by the Internet
Engineering Task Force (IETF).
•iSCSI technology simplifies the storage area network solution (setup
time, equipment and required skills) by using the Ethernet interface.
•From the point of view of the IP SAN topology, hosts are required to
receive and process iSCSI IP packets.
33. ISCSI TOPOLOGY
[Diagram: iSCSI servers, iSCSI tape libraries and iSCSI RAID arrays connected through an IP network, with iSCSI sessions between them]
34. ISCSI PROTOCOL MODEL 1/2
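[Diagram: iSCSI encapsulation, outermost to innermost]
Ethernet Frame: Ethernet Header + Data (IP packet) + FCS
IP Packet (Datagram): IP Header + Data (TCP segment)
TCP Segment: TCP Header + Data (iSCSI PDU)
iSCSI PDU: PDU Header + PDU Data
The figure also shows a CHK field.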
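The nesting shown in the encapsulation figure can be expressed as simple size arithmetic. The sketch below assumes the usual minimum header sizes (14-byte Ethernet header, 4-byte FCS, 20-byte IPv4 and TCP headers without options, 48-byte iSCSI basic header segment) and a PDU small enough to fit in one frame; in practice a PDU is often split across several TCP segments. The function name `encapsulated_sizes` is hypothetical.

```python
ETH_HEADER = 14   # Ethernet header, no VLAN tag (assumption)
ETH_FCS = 4       # frame check sequence
IP_HEADER = 20    # IPv4 header without options (assumption)
TCP_HEADER = 20   # TCP header without options (assumption)
ISCSI_BHS = 48    # iSCSI basic header segment


def encapsulated_sizes(pdu_data_len, mtu=1500):
    """Sizes in bytes of each layer when one iSCSI PDU travels in one frame."""
    pdu = ISCSI_BHS + pdu_data_len
    segment = TCP_HEADER + pdu
    packet = IP_HEADER + segment
    if packet > mtu:
        raise ValueError("PDU does not fit in a single frame at this MTU")
    frame = ETH_HEADER + packet + ETH_FCS
    return {
        "iscsi_pdu": pdu,
        "tcp_segment": segment,
        "ip_packet": packet,
        "ethernet_frame": frame,
    }


# 1024 bytes of SCSI data carried in a single frame.
sizes = encapsulated_sizes(1024)
efficiency = 1024 / sizes["ethernet_frame"]
```

Each layer adds its own header around the payload of the layer above, so the useful share of every frame is somewhat below 100 percent.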
35. ISCSI PROTOCOL MODEL 2/2
[Diagram: iSCSI protocol model, Initiator and Target stacks connected through an IP network]
Layers on each side, top to bottom:
SCSI Application
iSCSI Transport Protocol Services
Interconnect Services
TCP
IP
Data Link
Peer-to-peer protocols: SCSI Application Protocol, iSCSI Transport Protocol
Interfaces between layers: Protocol Service Interface, Interconnect Service Interface
36. SAN FC VS SAN ISCSI
FC:
•High cost
•High performance
•Low interoperability
iSCSI:
•Low cost
•Lower performance
•Standardized
FC and iSCSI can coexist on a storage network.
Each of them meets different needs.
37. DAS VS NAS VS SAN
                          DAS                 NAS                 SAN
Storage Type              Sectors             Shared files        Blocks
Data Transmission         IDE/SCSI            TCP/IP, Ethernet    Fibre Channel
Access Mode               Clients or servers  Clients or servers  Servers
Capacity (Bytes)          10^9                10^9-10^12          >10^12
Complexity                Easy                Moderate            Difficult
Management Cost (per GB)  High                Moderate            Low
38. CONCLUSION
Choosing the right storage solution is not an easy task, especially with
such a variety of storage technologies.
That is why there are several key criteria to consider, including:
Capacity: the amount and type of data (file level or block level)
Performance: I/O and throughput requirements
Scalability: Long-term data growth
Availability and Reliability: how mission-critical are your
applications?
Data protection: Backup and recovery requirements
Budget concerns
Editor's notes
Performance of the array also depends on the drives. To function properly, the array must wait for data to be written to each of the drives before it can continue. In the example charts for the RAID arrays, this means the controller must wait until all data for block 1 has been physically written across all the drives in the array before it can move on to the next set of data. An array in which one drive has half the performance of the other two will therefore slow down the whole array.
Three smaller drives could cost less than a single high-capacity drive yet provide more capacity.
RAID 0. Advantages: increased storage performance; no loss in data capacity. Disadvantages: no redundancy of data.
RAID 1. Advantages: provides full redundancy of data. Disadvantages: storage capacity is only as large as the smallest drive; no performance increase; some downtime to change the active drive during a failure.
RAID 0+1. Advantages: increased performance; data is fully redundant. Disadvantages: large number of drives required; effective data capacity is halved.
RAID 3 and RAID 5. Advantages: increased storage array performance; full data redundancy; ability to run 24x7 with hot swap. Disadvantages: high cost to implement; performance degrades during rebuilding.
Hosts can process iSCSI traffic in two ways: by installing initiator software and processing the related commands and data on the CPU, or by using a TCP/IP Offload Engine (TOE) to process IP packets, which reduces the CPU load and increases operating efficiency. An IP SAN does not require any additional switches. In contrast to FC SAN, IP SAN keeps the existing cabling and so avoids additional wiring expenses. Compared with FC SAN, IP SAN reduces not only the complexity of building the SAN but also the actual cost of equipment and cables.