In my last post, I gave an overview of what RAID is, and I mentioned that there are different types, or levels, of RAID arrays. In today's installment, I'd like to get into what's meant by that and some of the different types. The arrays are differentiated from each other based on how the hard drives interact with each other to provide redundancy, or if they provide it at all.


RAID levels are identified by numbers, with 0, 1, 5, and 6 being the most common. There are a few oddball levels that are identified in other ways, but they're pretty niche concerns, with the exception of JBOD. JBOD stands for Just a Bunch of Disks. This is when all the drives are seen individually, simply as separate hard drives. Literally, it's just a bunch of disks, nothing more. Who says computer engineers have no sense of humor?


The simplest RAID level to understand is probably RAID 1, also known as a mirror, which typically uses two hard drives. As you might guess from the name, one drive is a mirror of the other, or, to be more accurate, a copy. All the data is written identically to both drives, so that if one drive fails, the data is still intact on the other. The trade-off is that you only get one drive's worth of usable space.


Backing up a step, RAID 0 is a bit of a strange one compared to its cousins, because it provides no redundancy at all. RAID 0, also known as striping, requires at least two drives and can include many more. In this array, each chunk of data is split into two (or more) pieces, and each piece is written to a different drive at the same time. This increases data transfer speed because you're not limited to the speed of a single drive. The downside is that if you lose a drive, you lose all your data, because the data can only be reconstructed from a complete set of the pieces.
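
To make the splitting concrete, here is a minimal sketch of striping in Python. The function names `stripe` and `unstripe` and the 4-byte chunk size are made up for illustration; real controllers work on much larger blocks and in hardware or in the operating system.

```python
def stripe(data: bytes, num_drives: int, chunk_size: int = 4) -> list:
    """Deal fixed-size pieces of data round-robin across the drives."""
    drives = [[] for _ in range(num_drives)]
    for offset in range(0, len(data), chunk_size):
        piece = data[offset:offset + chunk_size]
        # Piece 0 goes to drive 0, piece 1 to drive 1, and so on, wrapping around.
        drives[(offset // chunk_size) % num_drives].append(piece)
    return drives


def unstripe(drives: list) -> bytes:
    """Reassemble the data by reading the pieces back in round-robin order."""
    pieces = []
    for row in range(max(len(drive) for drive in drives)):
        for drive in drives:
            if row < len(drive):
                pieces.append(drive[row])
    return b"".join(pieces)


data = b"ABCDEFGHIJKLMNOP"
drives = stripe(data, num_drives=2)
restored = unstripe(drives)
```

With two drives, the first drive ends up holding `ABCD` and `IJKL` and the second holds `EFGH` and `MNOP`. Take either drive away and half of every stretch of the data is simply gone, which is why a single failure ruins the whole array.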


RAID 5 is a more common implementation in server environments. RAID 5 takes at least three disks. To describe what RAID 5 is will take a tiny bit of algebra—wait, don't go! It won't hurt, I promise!


Still with me? Excellent. Okay, each chunk of data is split up and spread across the drives. Let's say we have three hard drives. Two drives get pieces of the chunk of data, and the third drive gets what's called parity, which you can think of as the sum of the other two pieces. (Real controllers use a bitwise operation called XOR rather than addition, and the drive that holds the parity rotates from chunk to chunk, but the idea is the same.) So to visualize, we have a + b = c. Now here's where the magic comes in. If one drive fails, the RAID controller can recreate the missing data using that formula. Drive 2 fails? Now we have a + x = c. Solve for x and you have your data. The downside is that you lose one hard drive's worth of space to storing the parity information, and write speed suffers because the parity has to be calculated, but you won't mind that when a drive goes down and your data is intact.
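
Real controllers generally compute parity with XOR rather than ordinary addition, because XOR undoes itself: apply the parity to the piece you still have and the missing piece falls out. Here is a minimal sketch of that idea in Python. The helper name `xor_bytes` and the sample data are made up for illustration.

```python
def xor_bytes(x: bytes, y: bytes) -> bytes:
    """XOR two equal-length byte strings together, byte by byte."""
    return bytes(p ^ q for p, q in zip(x, y))


a = b"RAID"                 # piece stored on drive 1
b = b"five"                 # piece stored on drive 2
parity = xor_bytes(a, b)    # stored on drive 3: the "c" in a + b = c

# Drive 2 fails. Rebuild its piece from the survivor and the parity.
recovered_b = xor_bytes(a, parity)
assert recovered_b == b

# The same trick works if drive 1 is the one that fails.
recovered_a = xor_bytes(b, parity)
assert recovered_a == a
```

The same formula rebuilds whichever single piece is missing, which is all the controller needs to survive one failed drive.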


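RAID 6 is like RAID 5, but it stores two different parity values for each chunk and needs at least four drives. If you can sit through some more algebra, think of it as a + b = x and a + 2b = y. The two equations have to be different, because that's what lets the controller solve for two unknowns at once. The advantage here is that you can lose up to two drives without data being impacted. The downside is that you have to sacrifice two hard drives' worth of space, but the extra security may be worth it.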
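
For two parity values to rescue two lost drives, they have to be calculated differently, so that the controller ends up with two distinct equations and two unknowns. Here is a minimal sketch of that using plain integers in Python. The second formula, a + 2b, and the function names are assumptions chosen to keep the algebra easy; real RAID 6 implementations typically do the equivalent with finite-field arithmetic on bytes.

```python
def make_parity(a: int, b: int) -> tuple:
    """Return two different parity values for data pieces a and b."""
    x = a + b        # first parity, the same idea as RAID 5
    y = a + 2 * b    # second parity, weighted differently (illustrative choice)
    return x, y


def recover_both(x: int, y: int) -> tuple:
    """Rebuild both data pieces when only the two parity values survive."""
    b = y - x        # (a + 2b) - (a + b) leaves b
    a = x - b        # substitute b back into a + b = x
    return a, b


x, y = make_parity(7, 5)
a, b = recover_both(x, y)
assert (a, b) == (7, 5)
```

If both parity values were calculated the same way, the second equation would tell you nothing new, and losing both data drives would leave one equation with two unknowns.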


In RAIDs 1, 5, and 6, a good controller should automatically begin rebuilding the array when the failed hard drive is replaced. You can either swap in a new drive yourself when the failure occurs or, if you have room for an extra drive, use what's called a “hot spare,” which is a hard drive already in place, designated to be used as a replacement whenever a drive in an array fails. It's like having a tire go flat and the spare tire automatically take its place.