RE: [cobalt-users] XTR Raid 5 hard drive failure



> -----Original Message-----
> From: B McCoy [mailto:bcmccoy42@xxxxxxx] 
> Sent: Monday, September 22, 2003 11:29 AM
> To: cobalt-users@xxxxxxxxxxxxxxx
> Subject: [cobalt-users] XTR Raid 5 hard drive failure
> 
> 
> I am getting a warning that drive 2 (from left) has failed. 
> Machine seems to be stable. I recently applied the latest 2 
> Cobalt patches ( a week or so
> ago) and the ssh patch.
First off, there should be some posts on this in the archives from two or three months back.
I had the monitor tell me the same thing, and it was wrong.
> 
> The last time I had a problem like this it was a kernel 
> problem and we were told to replace the drive. That didn't 
> work and we had to do a complete restore. Then a couple of 
> days later Cobalt acknowledged the kernel problem. The 
> replaced drive was never bad it seemed and active monitor was 
> giving faulty errors.
> 
> 1.Is there a way to confirm that this drive is bad and not 
> rely solely on active monitor?
I had this problem. I shut down, pulled the drive out, put it back in, and rebooted, and all has been fine since then. (I'm not sure the pulling and reseating was necessary, but I did it in case there was a poor contact.)
Once I rebooted, it picked up the drive and started rebuilding it. The server stays up while it rebuilds, so your only downtime is the reboot.
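For a check that does not depend on Active Monitor, here is a minimal sketch. It assumes the XTR's array is Linux software RAID (md) and that the kernel exposes /proc/mdstat, which is worth verifying on your box first. The function name degraded_arrays is made up for illustration. It prints the name of any array whose status field (for example [UU_U]) shows a missing member.

```shell
# Hypothetical helper: list md arrays that have a missing member.
# Reads mdstat-format text from the file given as $1 (default /proc/mdstat).
degraded_arrays() {
    awk '
        # Remember the array name from lines like "md1 : active raid5 ..."
        /^md[0-9]+ :/ { name = $1 }
        # A status field such as [UU_U] contains "_" for each missing member
        /\[[U_]*_[U_]*\]/ { print name }
    ' "${1:-/proc/mdstat}"
}
```

Running degraded_arrays with no argument reads the live /proc/mdstat. No output means every array reports all members up. An array will keep showing up here while it is still rebuilding, so check again once the rebuild finishes.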
> 
> 2. If it is bad what would happen if I replaced it with the 
> first supposedly bad drive with the original data on it.
That would be the way to go.
> 
> 3. Running raid 5 what happens if I remove the drive or 
> simply reboot?  ( I am running 4 30 gig hard drives raid 5) I 
> don't want to lose the machine on a weekday.
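I would do the reboot first, to see if this is a false alarm. See my answer to #1 above. RAID 5 keeps running with one drive missing, so a single pulled or failed drive should not take the machine down by itself.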
> 
> 4. In case I can't find the original drive ( we recently 
> moved and it may be hard to find) does anyone know the brand 
> and specs of the drives that were original equipment. I think 
> they are 30 gig Western Digital IDE but that's all I can remember. 
> My machine is co-located off-site.
You can find out what your drive is with these commands:
[root logs]# cat /proc/ide/ide2/hde/model
ST380021A
[root logs]# cat /proc/ide/ide2/hde/media
disk
[root logs]# cat /proc/ide/ide2/hde/geometry
physical     155061/16/63
logical      155061/16/63
[root logs]# cat /proc/ide/ide2/hde/cache
2048
[root logs]# cat /proc/ide/ide2/hde/capacity
156301488
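To read those numbers: as far as I know, the capacity value under /proc/ide is a count of 512-byte sectors, so 156301488 sectors is about 80 GB. That means the sample output above is from an 80 GB drive, presumably on a different machine than the 30 gig XTR drives in question. A rough sketch that prints model and size together, with drive_summary being a made-up name:

```shell
# Hypothetical helper: print model and decimal-GB size for one drive.
# $1 is a directory holding "model" and "capacity" files, such as
# /proc/ide/ide2/hde. Assumes capacity is a count of 512-byte sectors.
drive_summary() {
    model=$(cat "$1/model")
    sectors=$(cat "$1/capacity")
    echo "$model $(( sectors * 512 / 1000000000 )) GB"
}
```

For example, drive_summary /proc/ide/ide2/hde on the machine above would print the model followed by the size in GB. Run it once per drive directory to compare all four drives.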
> 
> CPU - hardly being used
> Memory 1 gig with 32% used.
> 
> Hard drive use
> OS and Programs 91.8 mb available 41.5 MB used 54%
> Virtual Sites  81018.0 MB available  78387.5 MB used 3.2%
> Home 79641.2 MB available 77765.6 MB used 2.0%
> 
> Thanks for any input!
> 
> _____________________________________
> cobalt-users mailing list
> cobalt-users@xxxxxxxxxxxxxxx
> To subscribe/unsubscribe, or to SEARCH THE ARCHIVES, go to: 
> http://list.cobalt.com/mailman/listinfo/cobalt-users
>