It’s quiz time: what’s the single most important thing in your Mac computing environment? Is it the actual Mac model you use? Is it the size of the display attached to the Mac? The speed of the CPU? The amount of memory? The size of the hard drive? The speed of and memory on the graphics card? The network connectivity?
While all of those may seem important, I would argue that the most important thing in your Mac computing environment isn’t any of those things. In fact, it’s nothing you can directly touch or see. It’s your data. Whether that data is photographs or text documents or code or 3D CAD drawings or digital artwork, it’s your data that’s truly important. All the other stuff is hardware, and is easily replaced. Your data is irreplaceable. That’s why you have a backup plan, right?
But just having a backup doesn’t solve all the problems: if a drive dies, the process of replacing it and restoring from the backup can be time-consuming. If you use your Mac for your livelihood, this downtime can be expensive. It’s also just a pain, restoring from backups and testing the restored data.
The problems with backups get even worse as you have more and more data: the more data you have, the more time required to restore that data when a drive fails.
Big data is also a big mess
In addition to the restore issues, if you have tons of data strewn across both internal and one or more external drives, you know how messy it is to store lots of data. Desk space, power adapters, and connectors for the external drives are messy. Backup plans are messy. Deciding what to store where is messy. Replacing a failed external drive is messy. Restoring from a backup is messy. Migrating to a new Mac is messy.
The above perfectly describes me, at least the semi-recent me: I’ve got about 5TB of data, and until recently, I had a total of one internal and three external drives to store it all. Yes, I know there are now 6TB drives, but when your data builds up over time, you add drives as needed, leading to an inefficient storage solution. After years of doing it this way, I was sick of it.
So I went looking for a solution to all of these issues. In particular, I wanted a storage solution that would:
Prevent data loss (as best as possible)
Minimize downtime in the event of drive failure
Provide speedy data access
Have lots of room for future growth
Ease the upgrade process when buying a new Mac
With those objectives in mind, after a lot of research, I chose a hardware-based RAID solution: I replaced multiple external drives with a LaCie 5Big Thunderbolt 2 using RAID 10. (If you’re fully RAID-aware, skip the following gory details and jump right to Pros and Cons.)
A brief primer on RAID
RAID is an acronym for Redundant Array of Independent Disks, which is a storage method that combines multiple physical disk drives into one virtual drive. A RAID setup can be faster than a single disk, and provide drive failure protection you can’t get with a single disk. But deciding how to implement RAID can be a complicated process.
Hardware vs. software RAID
The first thing I had to decide on was hardware versus software RAID. A hardware RAID is a separate computer that runs the RAID. You connect the RAID box to your Mac, install software that lets OS X talk to the box, and you’re off and running.
A software RAID is one that is managed by your Mac, though the disks can be (and usually will be) in an external box. Until El Capitan, you could use Disk Utility to create a software-based RAID, as described in How to configure a cheap, secure RAID backup system. In El Capitan, you’ll need to use a third-party app such as SoftRAID, as Apple has removed this functionality from Disk Utility. (If you’re Terminal-inclined, the diskutil command can also be used.)
But which is best? Software RAID is easier and much cheaper; hardware RAID is powerful but more expensive.
This table summarizes the key differences between hardware and software RAIDs. In a nutshell, if you have the funds, hardware RAID is a better solution. Because I was investing for the long term, I chose the hardware RAID path.
The other key decision is which RAID level to use, but to understand RAID levels, you need to know just a bit more about how a RAID actually stores data on its drives.
RAIDs can use a mix of striping (storing data across multiple disks), mirroring (duplicating one disk to another), and parity (enabling drive rebuilding via redundancy). Striped disks are really fast, mirrored disks are redundant, and parity helps recover the lost contents of either striped or mirrored drives.
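To make those three terms a little more concrete, here is a minimal Python sketch that treats each "disk" as a plain byte string. It is only an illustration, not how the LaCie box (or any real controller) lays out data: the `stripe` and `xor_blocks` helpers, the three-disk layout, and the two-byte chunk size are all assumptions invented for the example. Parity is shown as a simple XOR across the disks, which is the usual textbook form.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def stripe(data, disk_count, chunk_size):
    """Striping: deal fixed-size chunks out to the disks in round-robin order."""
    disks = [bytearray() for _ in range(disk_count)]
    for offset in range(0, len(data), chunk_size):
        target = (offset // chunk_size) % disk_count
        disks[target] += data[offset:offset + chunk_size]
    return [bytes(disk) for disk in disks]


data = b"ABCDEFGHIJKL"

# Striping: each hypothetical disk holds only a third of the data.
striped = stripe(data, disk_count=3, chunk_size=2)

# Mirroring: a second, identical copy of every disk.
mirrored = list(striped)

# Parity: one extra block computed from all the data disks.
parity = xor_blocks(striped)

# Pretend disk 1 died. XOR-ing the survivors with the parity block
# reproduces the missing contents.
rebuilt = xor_blocks([striped[0], striped[2], parity])

print(striped)
print(rebuilt == striped[1])
```

The trade-off is visible even at this scale: the mirror costs a full extra copy of the data, while the parity block costs only one extra disk's worth but needs every surviving disk to be read during a rebuild.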
All of these configurations mix together in something called the standard RAID levels, which is good reading if you’re having trouble sleeping at night. The most-basic levels are RAID 0, which stripes data across disks for speed without any redundancy, and RAID 1, which mirrors data across disks for redundancy. The linked article covers the rest of the standard RAID levels.
However, because I wanted both speed and protection, I decided to create a nested RAID, i.e. one that uses a combination of the standard levels. I chose to use RAID 1+0 or just RAID 10. RAID 10 requires four drives, with each pair of drives first being mirrored (RAID 1), then striped (RAID 0). (Note that to use a given RAID level, nested or not, the hardware you’re using must support it.)
This image may help you understand the setup, or it may make your head spin.
The end result is that RAID 10 is speedy (because data is written to multiple drives) and redundant (because the data is mirrored). In a four-disk RAID 10 array, two disks can fail at once without losing the array, as long as they’re not a drive and its mirror (i.e. Disk 0 and Disk 1 in the image). The combination of speedy access, redundancy, and support for two drive failures at once is why I decided to use RAID 10.
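If you’d like to see exactly which double failures a four-disk RAID 10 can shrug off, the short Python sketch below enumerates them. It assumes Disk 0 and Disk 1 form one mirrored pair (as in the example above) and that Disk 2 and Disk 3 form the other; that second pairing, along with the `MIRROR_PAIRS` and `array_survives` names, is my own assumption for illustration.

```python
from itertools import combinations

# Assumed layout: each tuple is one mirrored pair, and the pairs are striped.
MIRROR_PAIRS = [(0, 1), (2, 3)]


def array_survives(failed, mirror_pairs=MIRROR_PAIRS):
    """The array survives if every mirrored pair still has a working disk."""
    failed = set(failed)
    return all(not set(pair) <= failed for pair in mirror_pairs)


all_disks = sorted(disk for pair in MIRROR_PAIRS for disk in pair)

# Check every possible combination of two simultaneous failures.
outcomes = {combo: array_survives(combo) for combo in combinations(all_disks, 2)}
survivable = [combo for combo, ok in outcomes.items() if ok]
fatal = [combo for combo, ok in outcomes.items() if not ok]

print("Survivable double failures:", survivable)
print("Fatal double failures:", fatal)
```

Under that assumed layout, four of the six possible two-disk failures leave the array intact, and the two fatal ones are exactly the cases where both halves of a mirror die together. Any single failure is always survivable.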
Pros and cons
I’ve been using my RAID for about a year now, which has given me a good amount of time to learn both the pros and the cons of my setup. First, the good parts.
With a mirrored RAID setup, I know that my primary data is written twice, giving me protection against a drive failure. (Before everyone starts yelling, yes, RAID is not a backup! Everything on my RAID is also backed up on removable external drives.)
In this sense, my RAID has performed admirably: I have had a drive failure, and I essentially didn’t even notice it. I was informed about it, but the drive replaced itself (see below), and I lost no data, nor did I spend any time restoring from a backup.
Easy recovery from drive failure
The 5Big RAID box I chose holds five drives. Four are actively used in the RAID, and the fifth is a “hot spare” which will automatically replace a failed drive. Luckily (or unluckily, depending on your perspective), I had a chance to see this in action, as I had a drive die a few months into my ownership.
When it happened, I received a notification from the RAID, and went to the RAID device’s web management page. There I saw that, in fact, the hot spare had been swapped into the array, and was in the process of rebuilding.
To replace the dead drive, I just pulled it out and inserted the new drive, all without powering down. The newly-added drive became the new spare. And since that first drive died, I’ve had no other drive failures. It doesn’t get much easier than that for failure recovery.
My RAID 10 box benchmarks out at roughly twice the speed of an external USB 3 drive, which is what I’d otherwise be using. It’s nowhere near the speed of the internal SSD, of course, but then again, I can’t store 5+TB of data on the internal SSD.
While not as speedy as the SSD, the RAID is more than fast enough for my general use. I keep nearly everything on the RAID, too; only most-used applications and some work files reside on the SSD.
Room for expansion
My RAID box has a formatted capacity of 8TB; with my current storage needs, I still have over 4TB of space for future growth. At some point, if that becomes constraining, it should be possible to swap the 4TB drives in my box for 6TB drives, which would give me 12TB of capacity. If even that becomes constraining, I could switch to another RAID level that provides more storage (at the expense of some speed). In short, my RAID box should easily last as long as I need it to without getting full.
So much for the good stuff. What about the not so good?
There are no two ways around it: hardware-based RAID solutions can be expensive, and I chose a big box with a big price tag. You can buy a mid-range 27-inch Retina iMac for what I paid for a box of disks. I did not make this decision lightly, but I decided that my data was worth the extra expense of making sure it was as safe as I could make it.
The particular box I chose, a 20TB model, comes equipped with five 4TB drives. But the only way you get 20TB out of the box is if you run all five disks as RAID 0, which provides no data protection (though great speed). Configured as RAID 10, I get 8TB of usable space. How does capacity drop from 20TB to 8TB? Quite easily: Two 4TB drives are used for data storage, two more are used to mirror that data, and the last is the hot spare.
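The capacity arithmetic above is easy to play with in a few lines of Python. This is a rough sketch only: the `usable_tb` function is a hypothetical helper, it covers just the two levels discussed here, and it works in the drives’ advertised terabytes, so it ignores formatting overhead.

```python
def usable_tb(drive_count, drive_tb, level, hot_spares=0):
    """Estimate usable capacity in TB for a box of identical drives."""
    active = drive_count - hot_spares  # hot spares hold no data until needed
    if level == "raid0":
        # Pure striping: every active drive stores unique data.
        return active * drive_tb
    if level == "raid10":
        # Mirrored pairs: half of the active drives hold copies.
        if active < 4 or active % 2:
            raise ValueError("RAID 10 needs an even number of active drives, at least four")
        return (active // 2) * drive_tb
    raise ValueError(f"unsupported level: {level}")


print(usable_tb(5, 4, "raid0"))                 # all five 4TB drives striped
print(usable_tb(5, 4, "raid10", hot_spares=1))  # my setup: four active drives plus a spare
print(usable_tb(5, 6, "raid10", hot_spares=1))  # the same box refitted with 6TB drives
```

The three calls correspond to the advertised 20TB, the 8TB I actually get, and the 12TB I’d have after swapping in 6TB drives.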
I’m paying a lot of money for storage that I’m either not using at all (the spare), or that I’m using to hold copies of my data (the mirror disks). But that’s the setup I chose, knowing I wanted to keep my data as safe as possible.
Single point of failure
When you have multiple drives, if one fails you don’t have to replace everything, just that one drive. With my RAID, I’m mostly protected against drive failures, but the box itself becomes a single point of failure. If the power supply goes out, I’m shut down until I can get it replaced. More-expensive RAID boxes can be equipped with dual power supplies, for just such an eventuality. My RAID doesn’t have that feature, so failure of the power supply is a concern. (It is, however, an external power brick, so replacing it should be easy, if I ever have to do so. I could even order a spare, just to have it on hand.)
To ease the power supply’s job, I leave the RAID turned on all the time, though it sleeps when it can and when the Mac sleeps. I also have it plugged into an uninterruptible power supply, so a power outage won’t force a jarring shut-down (or power-up, when the power returns).
Requires drivers to operate
Many third-party RAID boxes require drivers to work with OS X; the LaCie is such a box. If you connect it to a Mac without the LaCie drivers installed, you won’t see the drive at all. In this case, the drive’s OS X software is just as important as the hardware. And as you can see, there’s quite a bit that gets installed.
Overall, the LaCie software is OK. You manage the RAID through a webpage, which works well enough, though it looks like it was written in 2003. The bigger issue is the system-level software: when major updates are released, you have to ensure that the software will work before you upgrade.
This was an issue with the recent El Capitan upgrade, as LaCie didn’t come out with officially-supported drivers until a few weeks after El Capitan’s release. (In my testing, the driver actually worked fine if it was already installed, but you couldn’t install a RAID box as new within El Capitan.)
I am, essentially, beholden to LaCie to keep updating their software for future OS X compatibility. Of all the cons, this is the one that concerns me the most, because a lack of updated software could turn a completely functional RAID into a useless box of disks.
Wrapping it all up
Is my solution for everyone? Absolutely not; if you don’t have a ton of data, and don’t mind the downtime in restoring from backup, there’s no reason to even consider it. But if you do have a ton of data, and/or if you’d like to minimize downtime in the event of drive failure, a hardware RAID is a reasonable solution.
There are tradeoffs involved, and hopefully my discussion of those tradeoffs can help you make your own decision. Personally, I’ve been thrilled with the RAID and its performance and data handling, so I’m OK with some of the risks involved. With some help from LaCie’s software department, hopefully it can be my forever storage solution.