Conceptual & Actual: RAID

Data is the most important asset of any organization, so organizations take great care to preserve it. Data is backed up daily, and the backups are also written to tapes that are stored at a distant location. If a disaster hits the data center, the data is recovered from the alternate location. These solutions recover the data only up to the point of the last backup; anything written after that is lost. Many verticals, such as banks, would not be happy with this approach.

Redundant Array of Independent Disks (RAID), sometimes expanded as Redundant Array of Inexpensive Disks, is a way to preserve data up to the latest point rather than only up to the last backup. RAID deals with the configuration of the data storage disks. The storage is built from several smaller disks, and a RAID controller decides how the data is written across them. For example, if about 500GB of storage is required, it can be obtained by striping (combining) 7 disks of 72GB each, which gives 504GB. The controller is responsible for distributing the data across these 72GB disks. To the end user, in this case the OS, it appears as a single 500GB disk. The advantage is that writes to the disks happen in parallel, so in the ideal case this setup can write up to 7 times faster than a single 500GB disk.
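To make the striping idea concrete, here is a minimal sketch of how a controller might map a logical block to one of the 7 disks. It assumes a simple round-robin layout with one block per stripe unit; the function name `locate_block` and the layout itself are illustrative assumptions, not how any particular controller works.

```python
# Illustrative sketch only: round-robin striping across equal-sized disks.
# locate_block is a hypothetical helper, not a real controller API.

NUM_DISKS = 7
DISK_SIZE_GB = 72


def locate_block(logical_block: int, num_disks: int) -> tuple:
    """Map a logical block number to (disk index, block offset on that disk)."""
    disk = logical_block % num_disks      # consecutive blocks go to consecutive disks
    offset = logical_block // num_disks   # each full pass over the disks is one stripe
    return disk, offset


# Total capacity seen by the OS: 7 x 72GB = 504GB
total_gb = NUM_DISKS * DISK_SIZE_GB

# The first 7 logical blocks land on 7 different disks,
# which is why they can be written in parallel.
first_stripe = [locate_block(b, NUM_DISKS) for b in range(NUM_DISKS)]
```

Block 7 wraps around to disk 0 again, at the next offset on that disk.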

RAID0 – RAID 0 means striping to get a bigger disk. Many smaller disks are striped together into one bigger disk, managed by the controller. There is no backup or crash recovery mechanism in this type of setup, so it cannot sustain a disk failure: losing any one disk loses the data on the whole array.

RAID1 – RAID 1 means mirroring. Every write goes to a disk and to its mirror, so if either disk crashes, the data can be recovered from the other. Data safety is achieved at the cost of 50% redundancy and no gain in write performance, because each write has to be done twice. Failover is also achieved in this setup.

RAID3 – In RAID 3 the data is stored on the data disks and its parity is stored on a separate, dedicated disk. With 3 disks, the data is written to disk1 and disk2 and the odd/even parity is written to the parity disk. If disk1 fails, the data on disk1 can be recovered using the parity information and the data on disk2. In this setup the data redundancy is 33%. The issue with this setup is that the parity disk has to handle a huge load, since every write updates it.
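The parity recovery described above can be sketched with XOR, which is the usual way to compute odd/even parity. This is a toy example on short byte strings; the sample data and the helper name `xor_bytes` are made up for illustration.

```python
# Illustrative sketch only: XOR parity over two equal-length data blocks.

def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Return the byte-wise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))


disk1 = b"HELLO   "   # data on disk1 (sample data)
disk2 = b"RAID3 OK"   # data on disk2 (sample data)

# The parity disk stores disk1 XOR disk2.
parity = xor_bytes(disk1, disk2)

# disk1 fails: rebuild it from the parity and the surviving disk2.
rebuilt_disk1 = xor_bytes(parity, disk2)
```

The same call with `disk1` in place of `disk2` rebuilds disk2, so either single data disk can be lost.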

RAID5 – Similar to RAID 3, but the parity is rotated between the disks. In RAID 3 we allocated the 3rd disk as the parity disk. In RAID 5 the disk that holds the parity changes with each write. For the first write, the 3rd disk holds the parity and the 1st and 2nd hold the data. For the next write, the 2nd and 3rd hold the data and the parity is stored on the 1st disk, and so on. This setup can sustain a single disk failure with the 33% redundancy of RAID 3 rather than the 50% redundancy of RAID 1.
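The rotation can be sketched as a small function that picks the parity disk for each stripe. The order used here (3rd disk first, then 1st, then 2nd) simply follows the description above; real controllers use various rotation schemes, so treat the formula and the names `parity_disk` and `stripe_layout` as assumptions for illustration.

```python
# Illustrative sketch only: rotating the parity position per stripe.
# Disk indexes are 0-based, so index 2 is the "3rd disk".

def parity_disk(stripe: int, num_disks: int) -> int:
    """Return the index of the disk holding parity for the given stripe."""
    return (stripe + num_disks - 1) % num_disks


def stripe_layout(stripe: int, num_disks: int) -> list:
    """Return a list marking each disk as data ("D") or parity ("P")."""
    p = parity_disk(stripe, num_disks)
    return ["P" if d == p else "D" for d in range(num_disks)]


# Layout of the first three stripes on a 3-disk array.
layouts = [stripe_layout(s, 3) for s in range(3)]
```

Over any 3 consecutive stripes each disk holds the parity exactly once, which spreads out the load that the dedicated parity disk carried in RAID 3.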

RAID6 – RAID 6 works on dual parity. In this setup, each stripe keeps its parity information on two disks and its data on 2 other disks. The parity is rotated between the available disks as in RAID 5. A RAID 6 setup can sustain 2 disk failures: if 2 disks fail at the same time, their contents can be rebuilt from the data and parity on the surviving disks.

RAID0+1 – RAID 0+1 combines RAID 0 and RAID 1, in that order, to get the advantages of both. First, the available disks are set up as RAID 0, i.e. striped together to achieve the required space. An identical configuration is built from a second set of disks. These two configurations are then set up as RAID 1, i.e. mirrored. In other words, it is 2 RAID 0 setups with RAID 1 mirroring on top. This setup can sustain disk failures as long as one of the mirror copies is intact. However, there is 50% redundancy of space.

RAID1+0 – RAID 1+0 combines RAID 1 and RAID 0, in that order. The individual disks are first mirrored in pairs. These mirrored pairs are then striped together to achieve the required space. This setup can sustain any disk failure as long as a disk and its mirror do not both fail, but it has 50% redundancy.
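The two nested layouts have the same space redundancy but tolerate different combinations of failures. The sketch below counts which two-disk failures each layout survives on a hypothetical 4-disk array, under a simplified model: RAID 0+1 needs one stripe set completely intact, and RAID 1+0 needs at least one working disk in every mirror pair. The disk numbering and function names are assumptions for illustration.

```python
# Illustrative sketch only: simplified survivability model for 4 disks (0..3).
from itertools import combinations


def survives_raid01(failed: set, stripe_sets: list) -> bool:
    """RAID 0+1 survives if at least one stripe set has no failed disk."""
    return any(not (set(s) & failed) for s in stripe_sets)


def survives_raid10(failed: set, mirror_pairs: list) -> bool:
    """RAID 1+0 survives if every mirror pair still has a working disk."""
    return all(set(p) - failed for p in mirror_pairs)


# Disks (0, 1) and (2, 3) are the two stripe sets for RAID 0+1,
# and the two mirror pairs for RAID 1+0.
groups = [(0, 1), (2, 3)]

two_disk_failures = [set(c) for c in combinations(range(4), 2)]
raid01_ok = sum(survives_raid01(f, groups) for f in two_disk_failures)
raid10_ok = sum(survives_raid10(f, groups) for f in two_disk_failures)
```

Under this model, RAID 0+1 survives 2 of the 6 possible two-disk failures and RAID 1+0 survives 4 of them. Both survive any single disk failure.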

Based on the space redundancy, the cost implications can be worked out. Based on the possibility of recovery, the availability can be measured. Similarly, performance can be measured based on how many places the data needs to be written. RAID 0 performs better because each block is written to a single disk and no mirror copy is written, but of course with lower availability. A similar analysis can be done for the other RAID levels.
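As a starting point for the cost analysis, the space redundancy of each level can be put into a small function. This is a rough sketch that assumes equal-sized disks and two-way mirrors; the function name `usable_fraction` and the level labels are illustrative.

```python
# Illustrative sketch only: usable fraction of raw capacity per RAID level,
# assuming equal-sized disks and two-way mirroring.

def usable_fraction(level: str, num_disks: int) -> float:
    """Return the fraction of raw capacity available for data."""
    if level == "RAID0":
        return 1.0                              # no redundancy at all
    if level in ("RAID1", "RAID0+1", "RAID1+0"):
        return 0.5                              # every block is stored twice
    if level in ("RAID3", "RAID5"):
        return (num_disks - 1) / num_disks      # one disk's worth of parity
    if level == "RAID6":
        return (num_disks - 2) / num_disks      # two disks' worth of parity
    raise ValueError(f"unknown RAID level: {level}")


# Redundancy (space lost to mirrors or parity) for the setups discussed above.
redundancy = {
    "RAID0 (7 disks)": 1 - usable_fraction("RAID0", 7),
    "RAID1 (2 disks)": 1 - usable_fraction("RAID1", 2),
    "RAID5 (3 disks)": 1 - usable_fraction("RAID5", 3),
    "RAID6 (4 disks)": 1 - usable_fraction("RAID6", 4),
}
```

With 3 disks RAID 5 loses about 33% of the raw space, and the share drops as more disks are added, since only one disk's worth of parity is needed.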

Thursday, October 2, 2008
