Saturday, November 3, 2012

Post Hurricane Sandy RAID Rebuild

I am fortunate that where I live did not suffer much damage from the recent storm named "Sandy".  I think we got winds of maybe 40-50 MPH and a fair bit of rain, but no major damage was done.  Most of the power lines in this area are buried underground, so I was happy that we never lost power during the storm.  We did, however, lose power the day after the storm had passed, probably as a side effect of the power company working to restore power for those who had lost it during the storm.

After power was restored, I went around the house turning on all of my computer and server equipment.  I didn't really do a thorough check, though.  Today, I went to put a file on my NAS and noticed that my NFS mount was not present on my workstation.  I tried mounting it manually and it just hung.  I tried pinging the NAS and got no response.  It was powered on, though.  It was time to hook up a monitor and keyboard to this usually headless server.

As soon as the monitor came up, I could see the problem.  The system was sitting at the GRUB menu screen.  This screen usually has a timeout that, when reached, boots the default selection.  This time, though, there was no timeout, so something had to be wrong.  I made the selection by hand and let the system boot.

As it booted, I noticed a message saying that my software RAID array was in a degraded state, along with something about an invalid partition table.  I chose to let it boot anyway.  Once the system was up and running, I logged in and was able to determine that the RAID member with the problem was on /dev/sda.
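For anyone wanting to spot a degraded array without hooking up a monitor, here is a minimal sketch of one way to check.  It assumes the usual /proc/mdstat layout, where each array's status line ends in a bracketed map such as [UU] and a missing member shows up as an underscore, as in [_U].  The function name degraded_arrays is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print the name of every md array that looks degraded.
# Reads /proc/mdstat by default; a different file can be passed for testing.
degraded_arrays() {
    awk '
        # Remember the most recent array name, e.g. "md127 : active raid1 ..."
        /^md/ { name = $1 }
        # A member map such as [_U] or [U_] contains "_" for a missing disk
        /\[[U_]*_[U_]*\]/ { print name }
    ' "${1:-/proc/mdstat}"
}

# Example use:
#   degraded_arrays
```

Running something like mdadm --detail /dev/md127 afterwards should show which member is marked faulty or removed.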

Below are the steps I used to remove the failed member from the array and add it back so that the array would begin rebuilding:

  • mdadm --manage /dev/md127 --fail /dev/sda1
  • mdadm /dev/md127 -r /dev/sda1
  • mdadm --zero-superblock /dev/sda1
  • mdadm /dev/md127 -a /dev/sda1
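The steps above can be sketched as a small shell function.  This is only an illustration, not exactly what I typed: the name readd_member and the DRY_RUN switch are made up for this example, the long options --remove and --add stand in for -r and -a, and it assumes the superblock is zeroed on the same member partition that gets re-added.  By default it only prints the commands, so they can be reviewed before anything touches a disk.

```shell
#!/bin/sh
# Hypothetical helper: fail, remove, wipe, and re-add one member of an md array.
# DRY_RUN defaults to 1, in which case the commands are printed, not executed.
readd_member() {
    array=$1    # e.g. /dev/md127
    member=$2   # e.g. /dev/sda1

    run=""
    if [ "${DRY_RUN:-1}" = "1" ]; then
        run="echo"
    fi

    # Each step runs only if the previous one succeeded.
    $run mdadm --manage "$array" --fail "$member" &&
    $run mdadm --manage "$array" --remove "$member" &&
    $run mdadm --zero-superblock "$member" &&
    $run mdadm --manage "$array" --add "$member"
}

# Example use (prints the four commands):
#   readd_member /dev/md127 /dev/sda1
# To actually run them as root:
#   DRY_RUN=0 readd_member /dev/md127 /dev/sda1
```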

Now I'm using the following command to watch the status of the rebuild:

  • watch cat /proc/mdstat
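If only the progress figure is of interest, for example to log it from a cron job, something like the sketch below might do.  It assumes the kernel's usual progress line in /proc/mdstat, which contains "recovery =" or "resync =", a percentage, and a finish= estimate.  The function name rebuild_progress is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: print "<percent> <estimated time left>" for each
# array that is currently rebuilding.  Prints nothing when no rebuild runs.
rebuild_progress() {
    awk '
        /recovery =|resync =/ {
            pct = ""; eta = ""
            for (i = 1; i <= NF; i++) {
                if ($i ~ /%$/) {
                    pct = $i
                    sub(/^=/, "", pct)        # handles "=100.0%" with no space
                }
                if ($i ~ /^finish=/) {
                    eta = $i
                    sub(/^finish=/, "", eta)  # keep only the time, e.g. 102.3min
                }
            }
            print pct, eta
        }
    ' "${1:-/proc/mdstat}"
}

# Example use:
#   rebuild_progress
```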

All I can do at this point is wait for the rebuild to complete.  Maybe one day I'll invest in a nice hardware RAID controller.
