RAID Definitions – What Levels are there and what should you use?

July 4, 2011

Disk Array – two or more disk drives connected together in such a way as to make it appear that they are one drive.

Software RAID – RAID levels that are provided by the Operating System. It is usually slower than Hardware RAID.

Hardware RAID – Specially built controllers that provide RAID levels through hardware. This is usually faster than the Software RAID provided by an Operating System.

RAID level 0 (better known as striping). Minimum of two drives. Data is interleaved between the drives so that the workload is balanced among all drives in the striped array. Performance is very good, but there is no provision for redundancy: if one drive fails, all data is lost. Use this type of RAID only when you need high-speed storage and aren’t worried about redundancy. For example, a RAID 0 array can be set up to hold NT’s virtual memory or other temporary files such as the SASWORK subdirectory. If the array fails, nothing of value is lost, since the data are temporary. That’s why RAID 0 suits fast storage for temporary data.
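To make the interleaving concrete, here is a minimal sketch of how a striped array might map logical blocks onto drives. The function name and the one-block stripe unit are assumptions for illustration; real controllers use a configurable stripe size.

```python
def stripe_location(logical_block, num_drives):
    """Map a logical block number to (drive index, block offset on that drive).

    Assumes a stripe unit of exactly one block (a simplification).
    """
    if num_drives < 2:
        raise ValueError("RAID 0 needs at least two drives")
    drive = logical_block % num_drives      # blocks rotate across the drives
    offset = logical_block // num_drives    # position of the block on that drive
    return drive, offset


# Six consecutive logical blocks spread over a three-drive array.
layout = [stripe_location(block, 3) for block in range(6)]
print(layout)  # [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, 1)]
```

Consecutive blocks land on different drives, which is why a long transfer keeps every drive busy at once, and also why losing any one drive leaves holes throughout every file.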

RAID level 1 (better known as mirroring). Minimum of two drives. Every write goes to two identical twin disk drives (the twin being the mirror). If one disk fails, the other has exactly the same data on it. There are two main disadvantages. The first is that every write has to be done twice, so writing long files can degrade system performance. One way the industry has tried to work around this is a technique known as duplexing. Duplexing can be implemented with a two-channel RAID controller or, when using NT’s software RAID, with two separate controllers. The two halves of each mirrored pair are placed on separate channels or separate controllers, which allows the two writes to occur almost simultaneously and speeds up the dual writes. The second problem is that you effectively lose half the disk space you purchased in order to provide the backup. Another way of looking at it is that disk space costs twice as much. This type of RAID is best used when redundancy is an absolute must and the system is mainly doing fast, small random writes. Best used in small file server systems.
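The mirroring behaviour can be sketched with a toy in-memory model. The class and method names below are hypothetical, and the dictionaries only stand in for real disks; the point is that each write is performed twice and a read survives the loss of either disk.

```python
class MirroredPair:
    """Toy model of a RAID 1 pair; each 'disk' is a dict of block -> data."""

    def __init__(self):
        self.disks = [{}, {}]
        self.failed = [False, False]

    def write(self, block, data):
        # Every write is performed once per surviving disk (the double write).
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                disk[block] = data

    def fail(self, index):
        # Simulate losing one disk and everything stored on it.
        self.failed[index] = True
        self.disks[index] = {}

    def read(self, block):
        # Any surviving disk can serve the read.
        for index, disk in enumerate(self.disks):
            if not self.failed[index]:
                return disk[block]
        raise OSError("both mirrors have failed")


pair = MirroredPair()
pair.write(0, b"payroll")
pair.fail(0)
print(pair.read(0))  # b'payroll'
```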

RAID level 0/1 – This is a combination of RAID levels 0 and 1. Minimum of four drives. Two RAID 0 arrays are set up and then mirrored against each other; that is to say, you have a mirrored pair of RAID 0 arrays. You get the performance benefit of striping from RAID 0 and the redundancy of RAID 1. Long writes still affect this type of RAID, and duplexing can be used to increase write performance. This RAID type is sometimes referred to as RAID 10, although strictly speaking RAID 10 stripes across mirrored pairs rather than mirroring two striped sets.

RAID level 2 – Intended for drives without built-in error detection, it uses error correction codes. Most SCSI drives have built-in error detection, negating the need for RAID 2.

RAID level 3 – Uses byte-level striping across several disk drives and stores parity on one disk drive. The parity can be used to rebuild a failed drive. It is very similar to level 4. Best used on a small file server.

RAID level 4 – Uses block-level striping across several disk drives and stores all parity on one dedicated disk drive. The parity can be used to rebuild a failed drive. Read performance is better than write performance because parity information has to be written with each write. A good RAID level for larger file servers.
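Parity in these levels is conventionally the bitwise XOR of the data blocks in a stripe. The sketch below, with illustrative helper names and tiny four-byte blocks, shows how a lost block is rebuilt from the survivors and why a small write costs extra I/O: the old data and old parity must be read before the new parity can be written.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(xor, column) for column in zip(*blocks))


def update_parity(old_parity, old_data, new_data):
    """Parity after overwriting one data block (read-modify-write)."""
    return xor_blocks([old_parity, old_data, new_data])


# Three data drives plus one parity drive.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Drive 1 fails: XOR the surviving data blocks with the parity to rebuild it.
rebuilt = xor_blocks([data[0], data[2], parity])
print(rebuilt == data[1])  # True

# Overwriting block 1 needs the old data and old parity, not the whole stripe.
new_block = b"ZZZZ"
new_parity = update_parity(parity, data[1], new_block)
print(new_parity == xor_blocks([data[0], new_block, data[2]]))  # True
```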

RAID level 5 – Similar to RAID level 4, but distributes the parity data among all the drives in the array, which can slow reads slightly but speeds writes up considerably. This is a good RAID level when reading small files or small parts of files, as on a database server, or when the I/O is mostly random. If one drive fails, it can be rebuilt from the parity data.
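One way to picture distributed parity is to print which drive holds the parity block in each stripe. The rotation used here (parity starts on the last drive and moves one drive to the left per stripe) is only one possible scheme, chosen as an assumption for illustration; actual implementations differ in how they rotate parity and order the data blocks.

```python
def raid5_stripe(stripe, num_drives):
    """Return one stripe as a list of labels: 'P' for parity, 'Dn' for data block n.

    Assumes parity moves one drive to the left on each successive stripe.
    """
    if num_drives < 3:
        raise ValueError("RAID 5 needs at least three drives")
    parity_drive = num_drives - 1 - (stripe % num_drives)
    next_data_block = stripe * (num_drives - 1)
    row = []
    for drive in range(num_drives):
        if drive == parity_drive:
            row.append("P")
        else:
            row.append(f"D{next_data_block}")
            next_data_block += 1
    return row


for stripe in range(4):
    print(raid5_stripe(stripe, 4))
# ['D0', 'D1', 'D2', 'P']
# ['D3', 'D4', 'P', 'D5']
# ['D6', 'P', 'D7', 'D8']
# ['P', 'D9', 'D10', 'D11']
```

Because the parity block sits on a different drive in each stripe, parity writes are spread over the whole array instead of queuing up on a single parity drive as in level 4.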

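RAID levels 6 and 7 – RAID 6 is similar to 5 but keeps two sets of parity, so it can recover from two drive failures. RAID 7 is not a standardized level; the name is used informally for triple-parity arrays, which can recover from three drive failures.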
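The space cost of each level can be summarized with a small calculation. This is a rough sketch assuming equal-size drives; the function name is hypothetical, and the minimum drive counts used for levels 3 through 6 are commonly quoted values rather than figures from the definitions above.

```python
def usable_capacity_gb(level, num_drives, drive_size_gb):
    """Usable space for an array of equal-size drives (simplified model)."""
    # Minimums for levels 3-6 are assumed, commonly quoted values.
    minimum = {"0": 2, "1": 2, "0/1": 4, "3": 3, "4": 3, "5": 3, "6": 4}
    if level not in minimum:
        raise ValueError(f"unsupported RAID level: {level}")
    if num_drives < minimum[level]:
        raise ValueError(f"RAID {level} needs at least {minimum[level]} drives")

    if level == "0":
        data_drives = num_drives            # no redundancy, all space usable
    elif level in ("1", "0/1"):
        if num_drives % 2:
            raise ValueError("mirroring needs an even number of drives")
        data_drives = num_drives // 2       # half the space holds the mirror
    elif level == "6":
        data_drives = num_drives - 2        # two drives' worth of parity
    else:
        data_drives = num_drives - 1        # levels 3, 4, 5: one drive's worth
    return data_drives * drive_size_gb


for level in ("0", "0/1", "5", "6"):
    print(level, usable_capacity_gb(level, 4, 100))
# 0 400
# 0/1 200
# 5 300
# 6 200
```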

RAID Array Implementation

RAID levels are implemented by software code. What distinguishes the different ways RAID can be implemented is where the software is actually executed. There are basically 3 ways that RAID is implemented:

The operating system (known as Software-Based RAID)

Bus-Based Array Adapters/Controllers

External Array Controllers (also known as “Bridge Controllers”)

Software-Based RAID arrays are implemented through the operating system, usually on entry-level servers. Windows NT Workstation provides Software-Based RAID 0, while Windows NT Server provides levels 0, 1 and 5. The RAID code is executed by the system’s CPU, and the parity calculations for level 5 touch every block that is written. This can slow overall performance, since CPU utilization and interrupts increase in order to perform the RAID operations. Use this information as a guide when deciding what kind of CPU to put in an entry-level server that will implement Software-Based RAID levels: choose the fastest CPU you can in order to get the best performance. (Parity is computed with bitwise XOR, which is integer work, so a fast FPU, the floating-point unit that handles floating-point math, matters less here than overall processor and memory speed.) Another hint: you usually want more than one processor in your server, and there is only one brand of processor that you can do this with. This brand also can’t be beat when it comes to math calculations. Software-Based RAID is considered to be the slowest way to implement RAID levels.

Bus-Based RAID arrays are implemented by an adapter/controller that fits inside the server case in one of the motherboard’s PCI slots. These controllers generally have one or more secondary processors that perform most of the RAID operations and I/O commands. This increases I/O performance as well as overall system performance, because this implementation does not depend on the CPU to process the RAID algorithms. Some Bus-Based RAID controllers also have on-board cache, which is reported to increase I/O performance further. This type of RAID implementation is said to be the fastest, since the RAID controller resides directly on the PCI bus. Because of cost, these controllers used to be found only on mid-range to high-end servers. But thanks to continued research and development and reductions in manufacturing cost, they have become available for entry-level servers.

External Array Controllers are usually located in the external RAID enclosure and hook into the system through one or more device channels, such as a SCSI port. They also have high-performance microprocessors on board and are able to execute all RAID software functions without interfering with the operating system. These microprocessors and advanced cache algorithms help many External Array Controllers be second in performance only to the Bus-Based RAID controllers.

However, bets are that as Fibre Channel develops and becomes more readily available, External Arrays will match and maybe even surpass their Bus-Based counterparts.