Today I worked with a friend who had just purchased some new hard drives to set up a new RAID 5 array. The array we built used 4x1.5TB external USB drives. We began by unpacking and inspecting the drives. We plugged each of them into power and opened GParted to begin reformatting them. We discussed his long-term goals for the drives, and he expressed a preference for XFS over the standard ext3 filesystem.
We reformatted each of the drives and created a single XFS partition on each. We also added a RAID flag to each partition. He and I have had conflicting experiences with RAID flags: I was under the impression that the flag needed to be added manually, and he was under the impression that it was added automatically by mdadm. I'm still not entirely sure what this flag controls or indicates.
We then told mdadm to create the RAID array using the following command:
We used /dev/md1 because I had an existing RAID array as /dev/md0, and we used [fghi] in the device path, a pattern that bash expands into the list of partitions "/dev/sdf1 /dev/sdg1 /dev/sdh1 /dev/sdi1". This is a really useful shortcut for specifying lists of drives.
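The exact command we ran isn't reproduced here, so what follows is only a sketch of how the [fghi] pattern behaves and of what a comparable mdadm invocation could look like. It works in a scratch directory with empty stand-in files (an assumption made so it is safe to run anywhere), and it only prints the mdadm command rather than executing it. The --level and --raid-devices values are inferred from the four-drive RAID 5 described above, not copied from our shell history.

```shell
# Work in a throwaway directory with empty stand-in files, not real devices.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
touch sdb1 sdf1 sdg1 sdh1 sdi1

# [fghi] matches exactly one character from the set f, g, h, i,
# so sdb1 is left out. The shell only expands to names that exist.
set -- sd[fghi]1
matched="$*"
echo "$matched"

# Build the device list and print (not run) a comparable mdadm command.
devices=""
for name in "$@"; do
    devices="$devices /dev/$name"
done
cmd="mdadm --create /dev/md1 --level=5 --raid-devices=$#$devices"
echo "$cmd"
```

One thing worth knowing is that this is filename matching, not brace expansion: if a drive is missing, its name silently drops out of the list, so it's worth echoing the expansion before handing it to mdadm.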
The first issue we encountered came when we checked a special file called /proc/mdstat. When this file is read, it displays the status of all RAID devices managed by mdadm on the system. The output indicated that the process of building the array was at 115747736/1465135936, which meant it had about 1.4TB to process. This was a little alarming because the array was supposed to end up at 4.1-4.5TB, depending on whether 1KB=1000B or 1KB=1024B (see Binary Prefixes). We attempted to mount the filesystem and mount reported success, but when we tried to access the filesystem, we encountered read errors.
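To make the size arithmetic concrete, here is a rough sketch of where the 4.1-4.5TB range comes from. It assumes the counter in /proc/mdstat is in 1024-byte blocks and that a four-drive RAID 5 gives three drives' worth of usable space; the variable names are just for illustration.

```shell
per_member=1465135936   # blocks per drive, taken from the mdstat counter
members=4

# RAID 5 spends one member's worth of space on parity.
usable=$(( per_member * (members - 1) ))
echo "$usable blocks usable"

# Same capacity expressed with decimal (TB) and binary (TiB) prefixes.
tb=$(awk -v b="$usable" 'BEGIN { printf "%.2f", b * 1024 / 1e12 }')
tib=$(awk -v b="$usable" 'BEGIN { printf "%.2f", b / (1024 * 1024 * 1024) }')
echo "$tb TB or $tib TiB"
```

The two results are the same capacity measured with different prefixes, which accounts for the gap between 4.1 and 4.5.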
Back to Step One
Based on these issues, we reformatted all of the drives, deleting the existing partitions and recreating them as ext3. We then repeated step two. This time we remembered a couple of things that helped us finally achieve the results we were working towards. The first was that the 1.4TB of data to process didn't represent the entire array; it represented the amount of parity information to be calculated, which on a four-drive RAID 5 is one drive's worth. The second was that we needed to create the filesystem on the RAID device itself. Not wanting to experiment more this evening, we created an ext3 filesystem on top of the array, and everything worked just fine. The array will be degraded for the next 33-50 hours while all of the parity is calculated. See the output of /proc/mdstat on my system:
md1 : active raid5 sdi1 sdh1 sdg1 sdf1
4395407808 blocks level 5, 64k chunk, algorithm 2 [4/3] [UUU_]
[=>...................] recovery = 7.9% (115747736/1465135936) finish=3859.3min speed=5827K/sec
md0 : active raid5 sdb1 sdd1 sdc1
1465143808 blocks level 5, 64k chunk, algorithm 2 [3/3] [UUU]
unused devices: <none>
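As a sanity check on the recovery line above, here is a small sketch that recomputes the progress and remaining time from the raw counters. It assumes the counters are 1K blocks and that the speed is in the same unit per second; the line is pasted in as a string so the script can run on any machine.

```shell
# Recovery line copied from the /proc/mdstat output above.
line='[=>...................] recovery = 7.9% (115747736/1465135936) finish=3859.3min speed=5827K/sec'

# Pull out blocks done, blocks total, and the current speed in K/sec.
done_blocks=$(printf '%s\n' "$line" | sed -n 's/.*(\([0-9]*\)\/\([0-9]*\)).*/\1/p')
total_blocks=$(printf '%s\n' "$line" | sed -n 's/.*(\([0-9]*\)\/\([0-9]*\)).*/\2/p')
speed=$(printf '%s\n' "$line" | sed -n 's/.*speed=\([0-9]*\)K\/sec.*/\1/p')

# Percentage complete and hours remaining at the current speed.
percent=$(awk -v d="$done_blocks" -v t="$total_blocks" 'BEGIN { printf "%.1f", 100 * d / t }')
hours=$(awk -v d="$done_blocks" -v t="$total_blocks" -v s="$speed" 'BEGIN { printf "%.1f", (t - d) / s / 3600 }')
echo "$percent% done, about $hours hours left at $speed K/sec"
```

The result lines up with the finish= field in this particular snapshot. The speed fluctuates while the array rebuilds, so the estimate moves around depending on when you look.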