So I woke up the other day and the server was making a LOT of noise. My first thought was that a fan was going out [I still think it is, actually], so I ran 'fmadm faulty', hoping to see something telling me which of the 8 fans was dying. Instead, I found I was in the midst of a zpool failure.
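For anyone following along, 'fmadm faulty' is the query against the Solaris fault manager; on my box it needs root:

    # ask the Solaris fault manager to list active faults
    fmadm faulty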
Checking 'zpool status', I was able to verify that one of the disks had died. I'm running raid-z2, which can survive two failed disks, so it wasn't exactly a crisis... but I learned long ago that it's better to fix it now than to wait until a couple more die.
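If you want to check your own pool, something like this will do it (my pool is named 'data'; substitute yours):

    # show pool health; a dead disk shows up as UNAVAIL or FAULTED
    zpool status data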
Normally, you would have to go buy a replacement disk. Luckily for me, this Asus box never did recognize the last 3 drive bays, so I had spares sitting right there. Counting across the 10 hot-swap SATA II bays (want to make sure to pop the right one), I took a guess at which drive was failing and which ones were not currently in use. Luckily, I was right.
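One thing that would have saved the guesswork: listing the SATA attachment points first. Something along these lines (the exact sataN/M names depend on your controller):

    # list attachment points; occupied vs. empty bays show in the Occupant column
    cfgadm -al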
Now, the "new" drive already had data on it; it used to be part of the root mirror before I upgraded those to larger drives. Even so, I was a little confused when the new drive wasn't being recognized.
I found this page, which helped dramatically. 'cfgadm' showed that drive '6/0' was not configured, so I ran 'cfgadm -c configure sata6/0'. The drive now showed up, but it was listed as 'unavail' with 'corrupted data', and 'zpool online' didn't work because of those errors. Finally, I got it going with 'zpool replace -f data c6t0d0'. It took quite a while for the resilver to finish. 'fmadm faulty' still showed the fault afterward, but I was able to clear that with the 'zpool clear' it recommended.
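For my own future reference, here's the whole sequence in one place, using my pool name ('data'), my failed device ('c6t0d0'), and my attachment point ('sata6/0'); yours will differ:

    # bring the replacement bay online at the controller level
    cfgadm -c configure sata6/0

    # force the replace, since the disk still carries an old pool label
    zpool replace -f data c6t0d0

    # watch the resilver progress until it completes
    zpool status data

    # once resilvered, clear the error counters so FMA stops flagging the pool
    zpool clear data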
I'm thinking I should hook up one of my woot-off lights to flash whenever 'fmadm faulty' shows a failure...
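If I ever build that, it'd be something like this minimal sketch, run from cron. It assumes 'fmadm faulty' prints nothing when the system is healthy, and 'flash-woot-light' is a hypothetical script wired to the light:

    #!/bin/sh
    # poll the fault manager; any output means an active fault
    if [ -n "$(fmadm faulty 2>/dev/null)" ]; then
        logger -p daemon.alert "fmadm reports an active fault"
        # /usr/local/bin/flash-woot-light   # hypothetical: trigger the woot-off light
    fi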