RE: [cobalt-users] Do RAID failures just go away????
- Subject: RE: [cobalt-users] Do RAID failures just go away????
- From: "Ligard, Vidar" <vligard@xxxxxxxxx>
- Date: Wed Sep 3 09:15:01 2003
- List-id: Mailing list for users to share thoughts on Sun Cobalt products. <cobalt-users.list.cobalt.com>
> -----Original Message-----
> From: Josh Kuperman [mailto:josh@xxxxxxxxxxxxxxxxxx]
> Sent: Wednesday, September 03, 2003 9:07 AM
> To: cobalt users
> Subject: [cobalt-users] Do RAID failures just go away????
>
>
> Last night I ssh'ed in and was checking my email. I couldn't
> help but notice large amounts of it were missing along with a
> slew of symlinks and the main web page address returned an
> empty directory in a browser. I'm on a RaQ XTR using RAID 0;
> there was a notice mailed out to admin that one of the disk
> drives had failed. All the directories when I did "ls ../../"
> weren't there (which of course couldn't be correct since I
> was in my home directory when I logged in.)
>
> There were some other oddities. My system load was at 2.00 as
> opposed to a usual 0.15 and there were two runaway processes,
> instances of VIM, owned by a user who hadn't logged in for a
> while. They were using 48% of CPU each according to top. I
> killed those.
>
> Now when I get in today I see that it all looks OK -- except
> for some nightly log rotations, vanishing squid log files,
> and wherever mutt stuck my mail. (I'm assuming a temporary
> folder someplace). I assume I'll find more problems once I
> know where to look.
>
> So I have the following questions:
>
> 1. Is the disk bad or good - how do I tell?
> 2. Was the runaway VIM a sign of hacking, bad coding, or
> irrelevant?
> 3. Is there any way to tell what would have been
> damaged by temporarily having one disk out of the loop?
>
I can't give you absolute answers, but here are some of my observations from my own RaQ XTR:
2. I have had runaway VIM processes that took up all the resources. While they were running, I couldn't really do much, or trust the results from other commands, until I killed them. They can be 'hard' to kill as well, i.e. I've had to send the kill signal a couple of times or try a different signal (I can't remember exactly which). Those VIM processes were generally left behind when I had SSHed in and my firewall timed out the connection, leaving SSH and VIM 'hanging out there'. Once I killed the process(es), everything was fine. I do not remember which user the processes were listed under, but I would assume it was me.
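For what it's worth, the "send the signal again, then try a different one" routine can be sketched in Python. This is only a sketch under my own assumptions: the function names are made up for illustration, and the signal order (SIGTERM, then SIGHUP, then SIGKILL) is a guess at a sensible escalation, not the signals I actually used back then.

```python
import os
import signal
import time


def process_exists(pid):
    """Return True if a process with this PID still exists."""
    try:
        # If the process is our own child, reap it so it does not linger
        # as a zombie; for anybody else's process this raises and is ignored.
        os.waitpid(pid, os.WNOHANG)
    except ChildProcessError:
        pass
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # it exists, but belongs to another user
    return True


def terminate_with_escalation(
    pid,
    signals=(signal.SIGTERM, signal.SIGHUP, signal.SIGKILL),  # assumed order
    wait_seconds=2.0,
):
    """Send each signal in turn until the process is gone.

    Returns the signal after which the process disappeared, or None if it
    was already gone or survived every signal in the list.
    """
    for sig in signals:
        try:
            os.kill(pid, sig)
        except ProcessLookupError:
            return None  # nothing left to kill
        deadline = time.monotonic() + wait_seconds
        while time.monotonic() < deadline:
            if not process_exists(pid):
                return sig
            time.sleep(0.1)
    return None
```

You would call it as terminate_with_escalation(12345), with the PID taken from top. Killing another user's process needs root, and SIGKILL gives VIM no chance to clean up its swap file, which is why it goes last.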
1. My server told me a RAID disk (RAID 5) was bad a little while back (about two months ago, I guess). I rebooted, the server rebuilt the RAID, and all has been fine since. You might want to check the archives on this one. I know someone had it happen more frequently than I did, but I honestly can't remember if there was a remedy.
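On the "how do I tell?" part, one thing you could look at, assuming the box uses Linux software RAID (md) and exposes /proc/mdstat, is the status field the kernel prints for each array, e.g. [4/3] [UUU_] for four expected members with one missing. Below is a rough Python sketch that pulls that field out. The function name and the sample text are made up for illustration; the sample is not output from a real RaQ.

```python
import re

# Made-up sample in the usual /proc/mdstat layout, for illustration only.
SAMPLE = """\
Personalities : [raid5]
md0 : active raid5 hdd1[3] hdc1[2] hdb1[1] hda1[0]
      1234 blocks level 5, 64k chunk, algorithm 0 [4/3] [UUU_]

unused devices: <none>
"""

# "md0 : active raid5 ..." -> array name, state, RAID level
ARRAY_RE = re.compile(r"^(md\d+)\s*:\s*(\S+)\s+(\S+)")
# "[4/3] [UUU_]" -> expected members, active members, per-member flags
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")


def parse_mdstat(text):
    """Return {array_name: info} parsed from /proc/mdstat-style text.

    'degraded' stays None for arrays whose entry carries no [n/m] field.
    This sketch assumes the array is listed as active, so the third field
    on the header line is the RAID level.
    """
    arrays = {}
    current = None
    for line in text.splitlines():
        header = ARRAY_RE.match(line)
        if header:
            current = header.group(1)
            arrays[current] = {
                "state": header.group(2),
                "level": header.group(3),
                "expected": None,
                "active": None,
                "degraded": None,
            }
            continue
        status = STATUS_RE.search(line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            arrays[current].update(
                expected=expected, active=active, degraded=active < expected
            )
    return arrays
```

On the live machine you would feed it open("/proc/mdstat").read() instead of SAMPLE. It only tells you whether the array currently has all of its members; it says nothing about whether the disk itself is healthy or what was damaged while it was out.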
Vidar
> --
> Josh Kuperman
> josh@xxxxxxxxxxxxxxxxxx
>
> _____________________________________
> cobalt-users mailing list
> cobalt-users@xxxxxxxxxxxxxxx
> To subscribe/unsubscribe, or to SEARCH THE ARCHIVES, go to:
> http://list.cobalt.com/mailman/listinfo/cobalt-users
>