[cobalt-developers] load average above 100
- From: "David Koopman" <dave@xxxxxxxxxxxxxx>
- Date: Wed Aug 15 06:55:12 2001
- List-id: Discussion Forum for developers on Sun Cobalt Networks products <cobalt-developers.list.cobalt.com>
I had a problem. On my RaqXTR, I noticed the load average at 60 and
rising. YIKES, I've never heard of a load average that high. I tried a
'shutdown -r now' and it said it was going down for reboot now but it
didn't. Then I tried a 'ps -aux' and didn't see anything unusual. I tried
a 'top' and didn't see anything unusual. I do not understand what is
happening. I tried killing processes. Still, the load average kept
climbing. It was climbing by about 2 every minute. The problem became worse.
When the load average was at 130 (really!), I tried a '/etc/rc.d/init.d/halt
reboot' -- then all the services ceased to work. I could still ping the
machine, though. And, after 10 minutes, I could still ping the machine, but
it still had not rebooted. The LCD panel said it was rebooting, but it
never completed. So then, I shut down the machine by holding the power button
for 5 seconds, then turned it back on. When it came back up, telnet would
not work, FTP would not work and e-mail would not work. The web admin
interface did, thank goodness. When I looked at Active Monitor, it said that
RAID is doing this: "Server data is being duplicated to the backup hard
drive." When I tried to go to the control panel area of web admin and turn
on Telnet (which was marked as off), it gave this error: "Cannot read
/etc/inetd.conf, /etc/inetd.conf is locked". The RAID rebuild finished
successfully, but the inetd.conf file never unlocked. I rebooted the machine
from the web admin panel. That worked fine, except the inetd.conf file was
still locked (or so it said).
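
For anyone chasing the same symptom (load climbing while 'ps' and 'top'
show nothing busy), here is a minimal sketch of one check worth making.
On Linux the load average counts processes in uninterruptible sleep
(state D, usually waiting on disk I/O) as well as runnable ones, so a
pile-up of blocked processes can push it up without using any CPU.
Whether that is what happened on this machine is not known. The function
names below are made up for illustration, and Python is assumed to be
available on the box.

```python
import os


def read_loadavg(path="/proc/loadavg"):
    """Return the 1-, 5- and 15-minute load averages as floats."""
    with open(path) as f:
        fields = f.read().split()
    return tuple(float(x) for x in fields[:3])


def blocked_processes(proc_root="/proc"):
    """Return sorted (pid, command) pairs for processes in state D."""
    blocked = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # only numeric entries are per-process directories
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                stat = f.read()
        except OSError:
            continue  # process exited (or is unreadable) while scanning
        # Layout is "pid (comm) state ..."; comm may itself contain
        # spaces or parentheses, so split on the LAST closing paren.
        open_paren = stat.find("(")
        close_paren = stat.rfind(")")
        if open_paren < 0 or close_paren < open_paren:
            continue
        rest = stat[close_paren + 1:].split()
        if rest and rest[0] == "D":
            blocked.append((int(entry), stat[open_paren + 1:close_paren]))
    return sorted(blocked)
```

Calling read_loadavg() and then blocked_processes() a few times a minute
apart would show whether the number of D-state processes grows along
with the load. A long list of them points at storage or a driver, not
at a runaway program.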
I had to get to the machine physically and attach a laptop to it to get a
command prompt. When I looked in the /etc directory, there was no inetd.conf
file at all! There was an inetd.conf.master, which I copied to inetd.conf, and then
telnet worked again. However, there were several corrupted mail files and
several corrupted MySQL database tables, and the DNS files were a mess. I had
to rebuild these areas. So messy. What happened? How can I prevent it from
happening again?
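
The manual fix described above (copying inetd.conf.master back into
place) could be sketched as a small check like the one below. The two
paths are the ones from this post. The function name is hypothetical,
and this only covers the missing-file case, not whatever makes the web
admin report the file as locked.

```python
import os
import shutil


def restore_if_missing(target="/etc/inetd.conf",
                       master="/etc/inetd.conf.master"):
    """Copy master to target only when target is absent.

    Returns True if the file was restored, False if target already existed.
    """
    if os.path.exists(target):
        return False  # never overwrite a config that is already there
    if not os.path.isfile(master):
        raise FileNotFoundError(master)
    shutil.copy2(master, target)  # copy2 also carries over mode and timestamps
    return True
```

It refuses to overwrite an existing inetd.conf on purpose, so it can be
run after an unclean shutdown without clobbering local changes.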
Dave.