Before I left my job in Thailand, I didn’t feel right leaving the system there with a single hard drive storing all of the business’s important data. With the arrival of my replacement (the rather brilliant James!), we got talking about potential solutions, and James mentioned that in his previous job (dealing mainly with Windows-based systems) they mostly used hardware RAID on a server rack. This got me thinking: how could I get that kind of technology on a much more basic machine, and on a budget? RAID was definitely the answer, but would replacing the motherboard really be necessary?
What is RAID?
RAID is an acronym for ‘Redundant Array of Inexpensive Disks’ (originally) but is probably more commonly known as a ‘Redundant Array of Independent Disks’. Redundancy means that you have more than one piece of equipment, hard drive or copy of your data doing the same job. NASA uses this kind of fail-safe in its space missions by having at least 2 of each critical piece of equipment; if one breaks, they can switch to the backup. If it’s good enough for NASA, it’s good enough for your file or media server!
In the case of a RAID it (usually) means having more than 1 hard drive with the same data on it, while the computer it’s attached to sees a single ‘logical’ drive. There are different levels of RAID array, but for those of you with file/media servers (like my own personal one) we’re probably going to be looking at making a RAID 1 array, where every drive in the array holds an identical copy of the data. This means that if you have 1TB worth of data, you need 2 x 1TB drives to store it with redundancy. Double the initial expenditure for the same amount of data? Sounds naff, right? There are other options, but they are going to need even more drives! This is probably the cheapest way to get the redundancy you’re going to want to keep your data safe.
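To put numbers on that trade-off, here is a minimal sketch of the usable space you get from a set of identical drives. The function name raid_usable_tb is made up for this post, and levels 0 and 5 are included only for comparison (they are the standard figures for those levels, not something I have tried myself).

```shell
#!/bin/sh
# Usable capacity of an array built from identical drives (whole TB only).
# usage: raid_usable_tb LEVEL DRIVE_COUNT DRIVE_SIZE_TB
# raid_usable_tb is a hypothetical helper, not part of any RAID tool.
raid_usable_tb() (
    level=$1; count=$2; size=$3
    case "$level" in
        0) echo $(( count * size )) ;;        # striping: all the space, no redundancy
        1) echo "$size" ;;                    # mirroring: every drive holds the same data
        5) echo $(( (count - 1) * size )) ;;  # one drive's worth of parity, needs 3+ drives
        *) echo "unsupported level: $level" >&2; exit 1 ;;
    esac
)

raid_usable_tb 1 2 1   # two 1TB drives mirrored   -> prints 1
raid_usable_tb 5 3 1   # three 1TB drives, RAID 5  -> prints 2
```

So with RAID 1 you pay for two drives and get one drive’s worth of space, which is exactly the ‘double the expenditure’ grumble above.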
Here’s the Wikipedia.org article on RAID for those who want to geek-out in more depth!
Software or Hardware RAID?
For the case I was working on with James, buying a new motherboard would have been more expensive, and the potential problems it might cause with our existing operating system (Ubuntu 12.04 LTS) made it a bit dangerous. I also read that a lot of cheaper motherboards advertising hardware RAID are, in fact, ‘fake RAID’. Basically they have an extended BIOS with their own software for creating RAIDs, so they just aren’t that great – but I haven’t tried them out so I could well be wrong. Let me know in the comments if I am!
I then started my research into software RAID arrays with Linux, having not had any exposure to them before. After a lot of googling around, I came to the conclusion that software RAID was the way to go: it meant all we needed to pay for was a new hard drive. If your machine supports hardware RAID then it’s an option you should consider. I don’t know enough about either to really say which is best, so I’ll let you draw your own conclusions.
mdadm (multiple device administrator) is a great bit of free Linux software (licensed under the GNU GPL) which allows you to create a RAID using hard drives attached to your machine’s motherboard (or over IDE or USB, or whatever is listed as a drive, if I’ve read up on it correctly). In our case we were using SATA drives (Western Digital Green 2TBs). Having read a lot about this, I came to the conclusion that you’re best off using 2 of the same hard drive (ideally buying two or more pairs at the same time – you may even get a discount if you buy enough!).
The only downside to doing this conversion is that you’re going to lose all of the data that already exists on your drive (more on this later!), so buying 1 extra hard drive was not going to be enough: we had to get the data off the existing drive and then copy it back once the RAID was configured. Bummer! Luckily for us, I had just bought an external USB drive for my own home server, so after business closed I backed up the data from the work file server and made sure that it had all copied across (I ran the command du -sh /path/to/folder on each copy to compare the sizes).
If I were to do this again (and I will on my own machine eventually, when funds allow), I would buy two new drives and copy the data from the existing one to the newly built RAID once configured, but in this case we already had an internal drive ready to go and a new near-identical drive.
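A note on checking that backup: du -sh only compares total sizes, and those can differ slightly between filesystems even when every file is fine. A stricter check is to compare checksums of every file. This is a rough sketch rather than what I actually ran at the time; the function names and the paths in the last comment are made up, and it assumes sha256sum (part of GNU coreutils) is installed.

```shell
#!/bin/sh
# Print a sorted "checksum  ./relative/path" line for every file under a directory.
checksum_tree() (
    cd "$1" || exit 1
    find . -type f -exec sha256sum {} + | sort -k 2
)

# Succeeds only if both directories exist and hold the same files
# with the same contents.
same_contents() {
    [ -d "$1" ] && [ -d "$2" ] &&
        [ "$(checksum_tree "$1")" = "$(checksum_tree "$2")" ]
}

# Hypothetical paths: the live data and its copy on the USB drive.
# same_contents /srv/files /media/usb-backup/files && echo "backup matches"
```

It reads every file on both sides, so expect it to take a while on a couple of terabytes, but it will catch a file that was truncated or corrupted during the copy, which a size comparison may not.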
- 2 x identical HDDs
- A motherboard with enough spare IDE or SATA ports (I can’t vouch for the safety of trying this out on 2 USB drives or similar – if you have experience let me know how well it works!)
- OPTIONAL – your original drive with data on it – you may have to put it into a USB caddy to get the data copied back if you don’t have enough spare ports left
Sadly, I didn’t take enough notes while configuring the array to write my own step-by-step guide, but this example post pretty much sums up what I ended up doing – except for the part at the end where I copied all of the data back from the backup drive to its original location!
N.B. I would recommend going for option 2.2, manually adding your RAID array to mdadm.conf, rather than scanning, as scanning could break any other existing arrays you may have. Plus you get to choose the array’s designation (in this example md0) – so the command should read like this:
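For a flavour of what the process involves, here is a minimal sketch of creating a two-drive RAID 1 array with mdadm. It is not a copy of the linked guide: the device names /dev/sdb1 and /dev/sdc1 are placeholders, ext4 is my assumption for the filesystem, and the run and create_raid1 helpers are made up for this post. By default it only prints the commands, because running them for real wipes the drives.

```shell
#!/bin/sh
# Print each command instead of running it unless DRY_RUN=0 is set,
# because mdadm --create and mkfs destroy whatever is on the target drives.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# usage: create_raid1 ARRAY_DEVICE MEMBER1 MEMBER2
create_raid1() {
    run mdadm --create "$1" --level=1 --raid-devices=2 "$2" "$3"
    run mkfs.ext4 "$1"          # put a filesystem on the new array (ext4 assumed)
    run mdadm --detail "$1"     # report the array state and its member drives
}

# Placeholder device names: check yours with lsblk before doing anything for real.
create_raid1 /dev/md0 /dev/sdb1 /dev/sdc1
```

Run as root with DRY_RUN=0 to actually execute the commands. While the two drives are syncing for the first time you can watch the progress with cat /proc/mdstat.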
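As a rough sketch only, and not necessarily the exact line from the guide: a manually written mdadm.conf entry names the array and lists the drives that belong to it. The array_line helper and the device names below are made up for illustration, and the config path in the comment is the one Ubuntu uses, so check where your distribution keeps it.

```shell
#!/bin/sh
# Build an mdadm.conf ARRAY line naming the array and listing its member devices.
# usage: array_line ARRAY_DEVICE MEMBER...
# array_line is a hypothetical helper; the device names are placeholders.
array_line() (
    array=$1; shift
    members=$(printf '%s,' "$@")    # "/dev/sdb1,/dev/sdc1," with a trailing comma
    echo "ARRAY $array devices=${members%,}"
)

array_line /dev/md0 /dev/sdb1 /dev/sdc1
# prints: ARRAY /dev/md0 devices=/dev/sdb1,/dev/sdc1

# To apply it for real (as root), append the line to the config, e.g.:
#   array_line /dev/md0 /dev/sdb1 /dev/sdc1 >> /etc/mdadm/mdadm.conf
```

Drive letters can change between boots, so identifying the array by its UUID is sturdier. Running mdadm --detail --brief /dev/md0 prints a ready-made ARRAY line with the UUID for just that one array, which you can paste into the file instead. On Ubuntu it is also worth running update-initramfs -u afterwards so the new config is picked up at boot.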
What could I use this for?
In future I hope to start messing about with SSDs in my media server. Whilst these are coming down in price (check out Amazon for 1TB units, which are getting to a near-affordable level for mere mortals like us), the idea of relying on just those is daunting, so having RAID arrays of spinning disks as a backup, or perhaps as a longer-term store for archiving, is a great way to go. Or simply replacing all my existing ‘production’ drives with different configurations, whilst having the data backed up on spinning disks on a machine elsewhere in the house, would be great!
The speed of access to SSDs would help performance as media gets hungrier and hungrier for bandwidth over time. The files I’m using now are about 4 times as large as they were 10 years ago when I first started watching media through my PC: a 42-minute American show in PDTV DivX would typically be 350MB (2 episodes per CD-ROM!), whereas a 42-minute show in 720p mkv now comes in at about 1.4GB (3 episodes per DVD-R!). This will change again as 4K media starts happening! RAID arrays would be ideal for giving the backup system redundancy. If you are running a small family or business server on the cheap, software RAID arrays are definitely a great first step in making sure your data is stored with redundancy.
Well, that’s enough for now folks – sorry, I had hoped to write this guide myself, but I can’t put it any more simply or succinctly than the guys at mysolutions.it have done!