To RAID or not to RAID...

Jayce

I just built a small Linux-based home server that will be doing quite a bit, including OwnCloud, Subsonic music streaming, 24/7 video surveillance, and file, print, and backup duties, and maybe someday (the end goal at least) it will hold all of my media and have my HTPC stream it from the server. Right now it's sitting at 2x500GB, non-RAIDed, just two standalone drives. The primary drive rsyncs to the other every night, which serves as a basic backup.

I considered getting significantly larger hard drives, such as 4x2TB HDDs, and doing some type of software RAID array using mdadm in Linux. I've used mdadm before with great success, but only with mirrors. I did some reading about the different RAID levels and I'm somewhat confused about the true benefits for a home setup, even a home setup that does quite a bit. Going through the levels, I decided that if I went with RAID, 6 or 10 would be the best options; both fit nicely with 4x2TB HDDs, giving me 4TB of usable space with some degree of redundancy. The thing I'm a little confused about is this URE (unrecoverable read error) I keep hearing about. Some people talk about how risky it is and how it can take down your entire array during a re-sync after a failed drive, etc. I began to wonder, especially when you consider the cost and everything else, is RAID even worth it? This is the way I look at it...

Let's say I get 2x2TB HDDs. One is primary, the other gets rsync'd to nightly. If drive A dies, I can just adjust /etc/fstab to mount drive B in drive A's spot. Then bingo - all of my services that utilize data on that drive will magically come back online. RMA the bad drive; when it comes in, plug it in and rsync it up, and it'll take over the backup position. Okay, great.
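One small tweak makes that fstab swap even easier: mount by UUID (or label) instead of raw device name, so a replacement drive only needs one line changed and a re-ordering of /dev/sdX names after a reboot can't bite you. A sketch with made-up UUIDs (use blkid to find the real ones):

```shell
# /etc/fstab sketch; the UUIDs here are hypothetical.
# nofail lets the system boot even if one drive is missing or dead.
# UUID=1111-aaaa  /media/storage     ext4  defaults,nofail  0  2
# UUID=2222-bbbb  /media/storage_II  ext4  defaults,nofail  0  2
```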

The other advantage to RAID is that I'd get larger volumes out of it. That way I could get a 4TB array and work with it accordingly. If I go the non-RAID 4x2TB route, I'd have two usable 2TB volumes. Sure, it might not matter, because in Linux the mount points can appear transparent anyway (I have Samba hosting data from 2 different drives and you would NEVER know the difference). And even if I don't RAID but still end up with 4x2TB, the cost would be the same, so in that instance it wouldn't matter much. I could always get 2x2TB and then later add two more if need be... I suppose that's not a bad idea.

I guess I'm just trying to figure out whether the benefits are worth the cost and the risk. I've talked to some people who love their RAID 5 or RAID 6 or whatever they have. Others are like, why waste the space? Either way I'll likely buy the same number of hard drives, since I want to use them either in an array or in a backup scenario, but eh. Too many thoughts on the table... much re-structure... Hit me with your thoughts, guys.
 
Backup can deal with more situations than RAID (accidental deletions, corruption, etc.) but has drawbacks (losing a day's worth of data, slow to recover, etc.).

A URE is a hardware failure, so ideally RAID would help mitigate that risk...

Your paragraph 3 reads even better with RAID 1:
Let's say I get 2x2TB HDDs. They are in RAID1. If drive A dies, I can do nothing. Then bingo - all of my services that utilize data on that drive will magically still be online. RMA the bad drive, when it comes in, plug it in. Okay, great.

*nix mount points are transparent, but moving files between them is much slower than moving within a single filesystem.

I've used software RAID 1s as well as hardware RAIDs no problem. I also back up my important data.

If backup is more important than RAID (and it is in many cases) you could do something like:
2 disks - 1 is a backup of the other
3 disks - 2 in RAID1, 1 to backup the array
4 disks - 1 live RAID1, 1 backup RAID1 (brutal redundancy)
4 disks - 3 disks in RAID5, 1 for backup (backup cannot fit all the data, though you could compensate for this through compression)

Sorry, I probably haven't made the decision any easier.
 
Thanks for the insight. I actually considered the 3-drive idea with two being a mirror. My only hesitation is that if I do four 2TB drives with no RAID and just do nightly syncs, I'd have 4TB to play with, but to do the 3-drive plan with a mirror and scale it up accordingly, we're talking six drives to hit 4TB of usable space. I guess that's the trade-off for some level of redundancy.

Also, I'm not understanding how a URE would be safer with a RAID setup. If you get hit with a URE while the RAID array is syncing you could lose the whole array, whereas with a regular hard drive you'd just have a corrupt file, no?

EDIT - just thinking out loud, it sounds like the bigger the array, the higher the probability of getting hit with a URE. Someone calculated that even a 2x2TB RAID 1 mirror has a 1-in-6 chance of getting hit with a URE. That raises the question: why am I bothering with RAID? I mean, yeah, I'd get better uptime and whatnot, but I guess I look at it like this...

Drive A = storage (/dev/sdb)
Drive B = storage_II (/dev/sdc)

Drive A fails.
umount /media/storage
umount /media/storage_II
mount /dev/sdc /media/storage
RMA Drive A
When Drive A is received, A will become the new B.
mount /dev/whatever-the-RMA-drive-is /media/storage_II
Let it rsync accordingly and bingo bango, I have a fresh backup with a raw uncompressed copy of all of my data.

And I'm back up and running with data that was backed up (depending on the time of day) anywhere between 23 hours and 1 minute ago. Perhaps I'm overthinking this? I'm also, admittedly, not well versed in what a URE actually does. I understand a URE is a broken sector, but when rebuilding RAID arrays, broken sectors are catastrophic. I guess I compare UREs to burning an ISO to a DVD, where you must use the slowest option to ensure a good burn of the image. If you choose a fast option and it skips, there's no going back - that particular bit of information is now bad and it'll show its ugly face when you try to use that DVD. Maybe if a URE just resulted in a corrupt file for whatever is sitting on that sector and moved on, I would care significantly less, but to implement a RAID setup with enough disk space that I want available *AND* have additional disks to keep backups of that RAID... we're talking some serious bacon. Hmm...
 
Thanks for the insight. I actually considered the 3-drive idea with two being a mirror. My only hesitation is that if I do four 2TB drives with no RAID and just do nightly syncs, I'd have 4TB to play with, but to do the 3-drive plan with a mirror and scale it up accordingly, we're talking six drives to hit 4TB of usable space. I guess that's the trade-off for some level of redundancy.

4TB over 6 drives with 100% redundancy (along with the compulsory backup) is about the level you should be thinking at.

Also, I'm not understanding how a URE would be safer with a RAID setup. If you get hit with a URE while the RAID array is syncing you could lose the whole array, whereas with a regular hard drive you'd just have a corrupt file, no?

No, at least I don't think so. A URE is a problem on a disk; the only thing that's going to help you with it is RAID. The problem comes when the number or distribution of multiple UREs effectively exceeds the spec of the array; this affects parity-driven RAIDs more than mirrored ones.

EDIT - just thinking out loud, it sounds like the bigger the array, the higher the probability of getting hit with a URE. Someone calculated that even a 2x2TB RAID 1 mirror has a 1-in-6 chance of getting hit with a URE.

It's not exactly complicated maths: consumer SATA drives are good for about 11TB of reads, this being the point at which the chance of a URE approaches 1. So 2TB/11TB ~ 1/6, and that's true for the RAID 1 and a single disk alike.
The good news is that the RAID 1 might save your data whereas a single disk definitely wouldn't.
For the avoidance of doubt, the more disks you have and the higher capacity those disks are, the more likely you are to run into trouble. Redundancy can help you out of that trouble, but backup is always required.
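That back-of-envelope figure can be reproduced in one line, assuming the commonly quoted consumer-drive spec of one URE per 10^14 bits read (which works out to roughly 11-12TB):

```shell
# Expected number of UREs while reading a full 2 TB drive, assuming the
# usual consumer spec of one unrecoverable read error per 1e14 bits.
awk 'BEGIN {
    bits_read = 2e12 * 8      # 2 TB expressed in bits
    spec_rate = 1e14          # spec: one URE per this many bits read
    printf "%.2f\n", bits_read / spec_rate
}'
# prints 0.16, i.e. roughly the 1-in-6 figure quoted above
```

Note this is an expectation from a worst-case datasheet number, not a guarantee; real drives often do much better than spec.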
 
kmote, I appreciate your continued insight. I hope I don't sound like a broken record but I am trying to wrap my brain around this as best I can. There's just more that goes into a RAID setup than what I expected.

So if a URE is a problem on a disk, and UREs become more common the more you read from the disk, is it out of the question to wonder why a drive that could be, say, 12 years old and running 24/7 is still fine and dandy? I know part of it may be luck of the draw, and perhaps that drive somehow hasn't read its full capacity a dozen times over front to back, but I still think it's reason enough to wonder.

I also guess you have to go into this knowing full well that your disks will fail. I always looked at hard drives as a more temporary component of a computer for this reason alone, but I was also hoping to build this server and leave it as-is for quite a number of years. That's why I was aiming so high with the disk space. So from my standpoint, I'm looking at potentially a RAID 6 setup and thinking it will be pretty sweet to start. But as drives fail over the years and I replace them, having to re-sync each one is only going to raise my chances of a URE. Raising the chances of a URE during the re-sync, based on what I read, raises the chance of a failed array. A failed array is a total bust, at which point I'd have to redo the entire array and restore from backup once it's rebuilt. We could also be talking 24 hours for a re-sync. For a home server, sure, this may not be a big deal, but if nothing else it's food for thought.

I do however understand that a mirror would be safer in regard to UREs. I was reading that with a mirrored array, if a URE pops up on drive A, the system should be intelligent enough to seek out that exact sector on drive B and replace the data accordingly, thereby making a mirrored array less likely to run into a severe road block than a single disk.

It just sounds like RAID gives you far more redundancy, but a few failed drives over the years (and therefore the re-sync process initiating each time) also significantly increases the chances of a URE failing the array in question. It's *this* that makes me wonder, especially because I'm not going to be dishing out the bacon for enterprise drives. I'm eyeing up 2TB WD Greens. But if a URE can have a detrimental impact on a single disk, it also negates my original thought that a single drive could be, somehow, less prone to a similar fate.

Again, thanks for your insight!
 
So if a URE is a problem on a disk, and UREs become more common the more you read from the disk, is it out of the question to wonder why a drive that could be, say, 12 years old and running 24/7 is still fine and dandy? I know part of it may be luck of the draw, and perhaps that drive somehow hasn't read its full capacity a dozen times over front to back, but I still think it's reason enough to wonder.

A 12-year-old disk is probably 10GB. Given that the probability of a URE correlates with the size of the disk, it's reasonable to say that the only reasons that 12-year-old disk isn't 200 times more reliable than the 2TB one are a) its age and b) improvements in technology.

I also guess you have to go into this knowing full well that your disks will fail. I always looked at hard drives as a more temporary component of a computer for this reason alone, but I was also hoping to build this server and leave it as-is for quite a number of years. That's why I was aiming so high with the disk space. So from my standpoint, I'm looking at potentially a RAID 6 setup and thinking it will be pretty sweet to start. But as drives fail over the years and I replace them, having to re-sync each one is only going to raise my chances of a URE. Raising the chances of a URE during the re-sync, based on what I read, raises the chance of a failed array. A failed array is a total bust, at which point I'd have to redo the entire array and restore from backup once it's rebuilt. We could also be talking 24 hours for a re-sync. For a home server, sure, this may not be a big deal, but if nothing else it's food for thought.

Agree totally, except about RAID 6 - you should prefer RAID 10.

I do however understand that a mirror would be safer in regard to UREs. I was reading that with a mirrored array, if a URE pops up on drive A, the system should be intelligent enough to seek out that exact sector on drive B and replace the data accordingly, thereby making a mirrored array less likely to run into a severe road block than a single disk.

A competent parity RAID implementation would surely do this upon detecting a URE. The problem occurs when a URE is undetected until the array is rebuilt.

It just sounds like RAID gives you far more redundancy, but a few failed drives over the years (and therefore the re-sync process initiating each time) also significantly increases the chances of a URE failing the array in question. It's *this* that makes me wonder, especially because I'm not going to be dishing out the bacon for enterprise drives. I'm eyeing up 2TB WD Greens. But if a URE can have a detrimental impact on a single disk, it also negates my original thought that a single drive could be, somehow, less prone to a similar fate.

I think you're overthinking this. The only time you'd be kicking yourself for using RAID is when as many drives fail as your array can tolerate, one of the survivors has a URE, and that URE prevents rebuilding. By that stage, without RAID, you would already have been kicking yourself several times over.

You work in a server environment, right? Why not ask one of the hardware guys what they think? I'm sure that they would tell you that, for production servers, RAID is basically essential.
 
Yeah, I do work in a server environment, however our servers that have any degree of storage are also massive, massive SANs that come with built in controllers and pre-configured RAID with hot swap drives. Then of course every other server we have just maps into those SANs accordingly in the event they need it. I'm not sure the server guys do much tinkering with one RAID vs another, as I think we're talking RAID 50 or RAID 60 here. Plus with that being an enterprise environment, RAID is a no brainer and disks are on hand in the event of an expected failure.

Also, why would you recommend RAID 10 over RAID 6? I did a significant amount of digging on 6 and 10. From what I can tell, RAID 10 can take a one-drive hit, MAYBE two if the second failure is the "right" drive, while RAID 6 can lose any two drives and still recover fine. Since 10 has little to offer except better performance, and 6 is still pretty dang fast, I was leaning towards 6. That said, I now remember that RAID 10 rebuilds significantly quicker than 6... Also, whether I go RAID 6 or 10, by my calculations I'm looking at 4TB of usable space with 4x2TB HDDs (again, WD Greens). Just highlighting that in case it matters in deciding between 6 and 10.
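As a quick sanity check on the usable-space claim above, assuming plain mdadm defaults for each level:

```shell
# Usable capacity with 4 x 2 TB drives under RAID 10 vs RAID 6.
awk 'BEGIN {
    n = 4; size_tb = 2
    raid10 = n * size_tb / 2       # mirrored pairs: half the raw capacity
    raid6  = (n - 2) * size_tb     # two drives worth of parity overhead
    printf "RAID 10: %d TB usable, RAID 6: %d TB usable\n", raid10, raid6
}'
# prints: RAID 10: 4 TB usable, RAID 6: 4 TB usable
```

With exactly four drives the two levels tie on capacity; the difference only shows up in failure tolerance and rebuild behaviour.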

Appreciate your time and insight!

EDIT -

I began to consider some other alternatives. To purchase 4x2TB HDDs (WD Reds) I'm looking at 500 dollars flat. That's high enough that it makes me uninterested in upgrading anything altogether. Suddenly I'm not so sure I want to do a RAID 6 or RAID 10 with four drives. That said, I can get two 3TB Seagates for 280 bucks shipped, which is 1TB less than the 4x2TB RAID option, but 3TB still gives me a decent chunk of space. As a result the mirrored option is becoming much more welcoming...
 
Just to recap, I ended up deciding to go with a mirror, since two hard drives (even 3TB ones) were cheaper than the four 2TB drives the RAID 6/10 plan called for. I also learned more about UREs and how they may or may not affect a RAID array. The bottom line is, when a URE lands, a regular hard drive will continually try to re-read that sector. As time passes (we're talking quite a few seconds here), the controller decides the drive has failed or become unresponsive and drops it from the array. This is where you'd be up the creek with a RAID 5: no parity left, newly installed HDD, and you hit a URE on a drive that keeps grinding away at that bad block. It's reasonable to suggest that with non-TLER/CCTL drives, your chances of a URE landing a detrimental hit are higher. Drives with TLER/CCTL enabled (such as WD Red HDDs) are designed not to grind endlessly on a block they can't read; instead, after a few seconds (7 if I recall), the drive gives up and reports the error, so the array can repair it from redundancy and keep on truckin'.
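If smartmontools is installed, you can check whether a drive honours a short error-recovery timeout (SCT ERC is the mechanism behind the TLER/CCTL marketing names). A sketch, with a hypothetical device name; this needs root and a real drive:

```shell
# Show the drive's current SCT error-recovery timeouts (read, write).
smartctl -l scterc /dev/sdX

# Set both timeouts to 7.0 seconds (the value is in units of 100 ms).
# WD Reds ship with this enabled; some desktop drives accept the setting
# per power-on only, and some refuse it entirely.
smartctl -l scterc,70,70 /dev/sdX
```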

It's good to know that a URE won't be as detrimental a hit on an array as long as you keep replacing drives when they fail, and as long as you use drives best suited for the job. Some people were very against my suggestion of using WD Greens for RAID, and now that I've spent some time reading about the pros and cons, I can see why. A WD Green would still be my first choice if I needed a single low-demand HDD for other purposes, though.

I'll be running 2x3TB WD Red HDDs in my server as an mdadm mirror. I thought about having my video surveillance feeds record to the main array, but the more thought I gave it, the more I leaned toward throwing in a single 1TB HDD dedicated to the 24/7 surveillance, just because those feeds take up a ton of space and they're very disposable (the likelihood of a HDD failure coinciding with an incident where I need to turn footage over to police is almost zilch... knock on wood). I also may invest in an SSD for the server OS, since it's currently running on a 500GB 2.5" 7200 RPM HDD that I can surely use somewhere else.
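For anyone following along, building that mirror with mdadm is a short sequence. Device names are hypothetical and --create wipes the named disks, so treat this as a sketch rather than something to paste blindly:

```shell
# Create the RAID 1 array from two blank disks (requires root).
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sdb /dev/sdc

# Put a filesystem on it and mount it.
mkfs.ext4 /dev/md0
mount /dev/md0 /media/storage

# Persist the array definition (path is /etc/mdadm/mdadm.conf on
# Debian-family distros, /etc/mdadm.conf on others) and watch the
# initial sync progress.
mdadm --detail --scan >> /etc/mdadm/mdadm.conf
cat /proc/mdstat
```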

At any rate, thanks for the insight! I certainly learned a lot and feel confident in my decision. Just so you guys can see how I was comparing costs vs "what you get":

2x3TB (WD Red) = 300 dollars; usable space in RAID 1: 3TB
4x2TB (WD Red) = 480 dollars; usable space in RAID 6: 4TB

180 dollars more for 1TB extra? Sure, there's a little added redundancy with two parities, but 480 was high enough that I just couldn't stomach it. 3TB is a truckload of space at a good price for redundancy, and it leaves me a decent amount of headroom.
 