
[cobalt-users] Help with unruly raq4



I have a RaQ4 that has been dying lately.  It stays up anywhere from
2 to 6 hours at a time.  At first I had a huge list of processes (250+) and
then the server would go down.  It was as if processes just were not being
killed off.  None of them seemed to hog the CPU; there were just a lot of
small ones.  I tried a few things and then decided to reload the OS and get
the sites up from backup.  So last night I applied all patches, ssh, php,
mysql, portsentry, ipchains, and logcheck.  The backups restored smoothly,
with no problems.

Everything was back up and running in about 3 hours.  Then sometime around 3
this morning the server went down.  I rebooted it at about 8 am.  I have been
watching it with top all morning, trying to get an idea of what is going on.
All of a sudden, 3.5 hours later, top just stopped.  I can't access http or
email, but I can ping the box.  Nothing looks like trouble in top; below is
what it showed when the box went down.  I am adding more RAM soon, 512 megs
total.  I have pored over all the logs and haven't noticed anything out of
the ordinary at the times it went down.  I was running top, and a coworker
was running top and trying to log in with ssh when it went down this time.
Is the sshd process owned by user nobody that ssh login?
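
Since top stops updating the moment the box hangs, a periodic tally written to disk might show which processes pile up beforehand.  The following is only a minimal sketch, not something I have run on the RaQ4: it assumes a reasonably recent Python is installed and that the local ps accepts `-e -o comm=`.  The function names, the log path, and the 60-second interval are all made up for illustration.

```python
import os
import subprocess
import time
from collections import Counter


def tally_commands(ps_output):
    """Count processes per command name from `ps -e -o comm=` style output."""
    names = [line.strip() for line in ps_output.splitlines() if line.strip()]
    return Counter(names)


def format_snapshot(counts, timestamp):
    """Build one log line: timestamp, total count, then commands by frequency."""
    total = sum(counts.values())
    parts = ["%s=%d" % (name, n) for name, n in counts.most_common()]
    return "%s total=%d %s" % (timestamp, total, " ".join(parts))


def log_forever(path="/var/log/proc-tally.log", interval=60):
    """Append a snapshot every `interval` seconds (path and interval are hypothetical)."""
    while True:
        # Assumes the installed ps supports `-e -o comm=`.
        out = subprocess.run(
            ["ps", "-e", "-o", "comm="],
            capture_output=True, text=True, check=True,
        ).stdout
        line = format_snapshot(tally_commands(out), time.strftime("%Y-%m-%d %H:%M:%S"))
        with open(path, "a") as fh:
            fh.write(line + "\n")
            fh.flush()
            # Force the line to disk so it survives a hard hang.
            os.fsync(fh.fileno())
        time.sleep(interval)
```

Calling `log_forever()` as root in the background would leave a log whose last few lines show the process mix just before the next hang, which could then be compared against the 250+ pile-up seen before the reload.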

What can I do to find out if this is some kind of hardware problem?

Thanks for any help you can offer.



11:08am  up  2:36,  2 users,  load average: 0.07, 0.16, 0.15
57 processes: 56 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  8.6% user,  2.3% system,  0.5% nice, 88.3% idle
Mem:   127776K av,  102964K used,   24812K free,  207668K shrd,   23148K buff
Swap:  131532K av,       0K used,  131532K free                   14880K cached

  PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
10654 nobody    18   0  1236 1236  1064 S       0  4.9  0.9   0:00 sshd
 3798 httpd      2   0  9756 9756  8320 S       0  1.5  7.6   0:02 httpd
10652 root      11   0  1412 1412  1188 S       0  1.3  1.1   0:00 sshd
10563 root       4   0   872  872   676 R       0  1.1  0.6   0:01 top
 3947 root       4   0   880  880   676 S       0  0.9  0.6   1:17 top
 1520 root       2   0  1656 1656  1328 S       0  0.5  1.2   0:17 sshd
 1956 root       0   0  1656 1656  1328 S       0  0.1  1.2   0:07 sshd
    1 root       0   0   472  472   400 S       0  0.0  0.3   0:03 init
    2 root       0   0     0    0     0 SW      0  0.0  0.0   0:01 kflushd
    3 root       0   0     0    0     0 SW      0  0.0  0.0   0:00 kupdate
    4 root       0   0     0    0     0 SW      0  0.0  0.0   0:00 kswapd
    5 root     -20 -20     0    0     0 SW<     0  0.0  0.0   0:00 mdrecoveryd
  395 root       0   0   524  524   424 S       0  0.0  0.4   0:02 syslogd
  404 root       0   0   768  768   384 S       0  0.0  0.6   0:00 klogd
  434 root       0   0   616  616   508 S       0  0.0  0.4   0:00 crond
  446 root       3   0   484  484   412 S       0  0.0  0.3   0:00 inetd
  475 named      0   0  2376 2376  1080 S       0  0.0  1.8   0:04 named
  480 root       5   0  1016 1016   908 S       0  0.0  0.7   0:00 sshd
  491 root       0   0  5792 5792  4760 S       0  0.0  4.5   0:01 httpd.admsrv
  514 root       0   0  6436 6436  5044 S       0  0.0  5.0   0:00 httpd.admsrv
  519 root       0   0  8584 8584  7660 S       0  0.0  6.7   0:04 httpd
  585 postgres   5   5  1336 1336   936 S N     0  0.0  1.0   0:00 postmaster
  713 root       0   0   824  824   660 S       0  0.0  0.6   0:00 safe_mysqld
  737 root       0   0  2580 2580  1196 S       0  0.0  2.0   0:00 poprelayd
  748 mysql     10  10  2016 2016  1372 S N     0  0.0  1.5   0:00 mysqld
  754 mysql     13  10  2016 2016  1372 S N     0  0.0  1.5   0:00 mysqld
  755 mysql     10  10  2016 2016  1372 S N     0  0.0  1.5   0:00 mysqld
  770 root       0   0   128  128   108 S       0  0.0  0.1   0:00 lcdsleep
  809 root       0   0   460  460   384 S       0  0.0  0.3   0:00 getty
 1467 root       0   0  6384 6384  5072 S       0  0.0  4.9   0:00 httpd.admsrv
 1523 root       0   0   936  936   736 S       0  0.0  0.7   0:00 sh
 1962 root       0   0   964  964   752 S       0  0.0  0.7   0:00 sh
 2234 root       0   0  1760 1760  1324 S       0  0.0  1.3   0:00 sshd
 2236 root       0   0   748  748   480 S       0  0.0  0.5   0:00 sftp-server
 2764 root       1   0  1360 1360  1008 S       0  0.0  1.0   0:00 sendmail
 3711 root       0   0  6288 6288  5032 S       0  0.0  4.9   0:00 httpd.admsrv
 3794 httpd      0   0  9556 9556  8240 S       0  0.0  7.4   0:02 httpd
 3796 httpd      0   0  9680 9680  8256 S       0  0.0  7.5   0:02 httpd
 3797 httpd      0   0  9676 9676  8272 S       0  0.0  7.5   0:02 httpd
 3799 httpd      0   0  9508 9508  8260 S       0  0.0  7.4   0:02 httpd
 3803 httpd      0   0  9552 9552  8240 S       0  0.0  7.4   0:02 httpd
 3813 httpd      0   0  9504 9504  8248 S       0  0.0  7.4   0:02 httpd
 3815 httpd      0   0  9676 9676  8264 S       0  0.0  7.5   0:02 httpd
 3835 httpd      0   0  9780 9780  8280 S       0  0.0  7.6   0:02 httpd
 3836 httpd      0   0  9572 9572  8240 S       0  0.0  7.4   0:02 httpd
 3837 httpd      0   0  9488 9488  8244 S       0  0.0  7.4   0:02 httpd
 3838 httpd      0   0  9604 9604  8248 S       0  0.0  7.5   0:02 httpd
 4225 httpd      0   0  9580 9580  8240 S       0  0.0  7.4   0:01 httpd
 4257 httpd      0   0  9728 9728  8256 S       0  0.0  7.6   0:02 httpd
 4510 httpd      0   0  9680 9680  8248 S       0  0.0  7.5   0:02 httpd
 5004 httpd      0   0  9488 9488  8236 S       0  0.0  7.4   0:02 httpd
 5005 httpd      0   0  9496 9496  8244 S       0  0.0  7.4   0:01 httpd
 5006 httpd      0   0  9456 9456  8232 S       0  0.0  7.4   0:02 httpd
 5007 httpd      0   0  9556 9556  8236 S       0  0.0  7.4   0:01 httpd
 5008 httpd      0   0  9488 9488  8244 S       0  0.0  7.4   0:01 httpd
 7560 root       0   0  2292 2292  1268 S       0  0.0  1.7   0:01 sendmail
10623 nodakbet   0   0  1384 1384  1028 S       0  0.0  1.0   0:00 in.proftpd