RAID 1 on Debian

What to do when a drive fails

The status of the RAID disks is monitored continually by mdadm, and you can set it up to email an alert if one of the drives fails. If that happens, here's what to do. NB: This is based on what I've read in the docs; I haven't actually had to test it, so proceed at your own risk. Assume that /dev/hda is the drive that has failed:

Remove the faulty disk from the array. This means marking each of its partitions as faulty and then removing that partition from its md device. Make certain that you're removing the correct disk: the faulty one! Removing the good disk will result in a very unhappy rest of the day.
mdadm --set-faulty /dev/md0 /dev/hda1
mdadm --remove /dev/md0 /dev/hda1
mdadm --set-faulty /dev/md1 /dev/hda5
mdadm --remove /dev/md1 /dev/hda5
mdadm --set-faulty /dev/md2 /dev/hda6
mdadm --remove /dev/md2 /dev/hda6
mdadm --set-faulty /dev/md3 /dev/hda7
mdadm --remove /dev/md3 /dev/hda7
mdadm --set-faulty /dev/md4 /dev/hda8
mdadm --remove /dev/md4 /dev/hda8
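
The commands above follow the same pattern for every array, so they can be generated rather than typed one by one. What follows is a minimal sketch, assuming the array-to-partition layout listed above; the function name fail_and_remove and the "md:partition" pair notation are made up for illustration and are not part of mdadm. It only prints the commands, so the list can be checked against the faulty disk before anything is run:

```shell
#!/bin/sh
# Dry-run sketch: prints the mdadm commands, it does not execute them.
# FAILED_DISK and the pair list mirror this tutorial's layout (an assumption
# about your system); adjust both before use.
FAILED_DISK=/dev/hda

fail_and_remove() {
    # $1 = failed disk; remaining arguments = "mdN:partition-number" pairs
    disk=$1
    shift
    for pair in "$@"; do
        md=${pair%%:*}      # text before the colon, e.g. md0
        part=${pair##*:}    # text after the colon, e.g. 1
        echo "mdadm --set-faulty /dev/$md $disk$part"
        echo "mdadm --remove /dev/$md $disk$part"
    done
}

fail_and_remove "$FAILED_DISK" md0:1 md1:5 md2:6 md3:7 md4:8
```

Once the printed list looks right, the output can be piped to sh as root to actually run it.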

Shut down and power off the box.
Physically remove the failed drive.
Install a new drive.
Restart the box. It should boot from the RAID device, with each array running degraded and its member on the replaced drive showing up as missing.
Use mdadm to add in the new drive's partitions as before. The new disk first has to be partitioned to match the good one, so copy over the good disk's partition table as we did before. There is no need to format the new partitions: once they are added, mdadm resyncs each mirror block by block, which copies the filesystems along with the data.
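Confirm via cat /proc/mdstat that the RAID has rebuilt itself using the new drive. A fully rebuilt mirror shows [UU]; a degraded one shows an underscore in place of the missing member, such as [_U].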
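
The last two steps can be sketched in the same dry-run style. This is untested on real hardware and rests on assumptions: /dev/hdc is taken to be the surviving disk (substitute your own), the partition table is copied with sfdisk -d piped into sfdisk, and the helper names readd_commands and mdstat_state are invented for this example. The status check only looks for an underscore in the [UU]-style member map of /proc/mdstat:

```shell
#!/bin/sh
# Dry-run sketch: prints the re-add commands, it does not execute them.
GOOD_DISK=/dev/hdc   # assumption: the surviving mirror half
NEW_DISK=/dev/hda    # the replacement drive

readd_commands() {
    # $1 = good disk, $2 = new disk; remaining arguments = "mdN:partition" pairs
    good=$1
    new=$2
    shift 2
    # Copy the partition table from the good disk to the new one.
    echo "sfdisk -d $good | sfdisk $new"
    for pair in "$@"; do
        echo "mdadm --add /dev/${pair%%:*} $new${pair##*:}"
    done
}

mdstat_state() {
    # Prints "degraded" if any array has a missing member (e.g. [_U]),
    # "in-sync" if none does, "unknown" if the status file cannot be read.
    file=${1:-/proc/mdstat}
    if [ ! -r "$file" ]; then
        echo "unknown"
    elif grep -q '\[[U_]*_[U_]*]' "$file"; then
        echo "degraded"
    else
        echo "in-sync"
    fi
}

readd_commands "$GOOD_DISK" "$NEW_DISK" md0:1 md1:5 md2:6 md3:7 md4:8
mdstat_state
```

An array that is still rebuilding keeps the underscore until the resync finishes, so mdstat_state should keep reporting "degraded" during the rebuild and switch to "in-sync" only afterwards.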