Re: [cobalt-users] Qube2 Crashes without obvious cause
- From: Malcolm McLeary <mmcleary@xxxxxxx>
- Date: Mon Jan 15 04:28:01 2001
- List-id: Mailing list for users to share thoughts on Cobalt products. <cobalt-users.list.cobalt.com>
Hi Guys,
on 15/12/00 7:31 AM, Malcolm McLeary at mmcleary@xxxxxxx wrote:
> It's an appliance ... I need to minimise downtime ... I intend to swap the
> disk into the new unit and restart it ... let the OEM worry about what the
> real problem is.
Over the Christmas break this Qube performed just great ... the problem is
there were no users in the office. Once staff returned it started to hang
again. :-(
I arranged for a new box to be delivered to the customer, then swapped the
disks. The idea was to see whether the problem was the disk (or software) or
the hardware ... within an hour it hung. Hence I concluded that the problem was
on the disk ... either the disk itself, the software or the customer's data.
Next step was to put the disks back in their respective boxes, configure the
new box and transfer the customer's data.
As soon as I got the new box on-line and began testing client workstations
the new Qube hung. This time I had a telnet session running top and
captured the top output at the time of the crash ...
5:02pm up 1:19, 1 user, load average: 0.38, 0.49, 0.35
70 processes: 65 sleeping, 5 running, 0 zombie, 0 stopped
CPU states: 37.2% user, 43.5% system, 73.1% nice, 20.0% idle
Mem: 63396K av, 62484K used, 912K free, 35528K shrd, 8152K buff
Swap: 130748K av, 176K used, 130572K free 37260K cached
  PID USER     PRI  NI  SIZE  RSS SHARE STAT  LIB %CPU %MEM   TIME COMMAND
 2102 yu-lan    19  10  1116 1116   932 R N     0 64.9  1.7   0:03 imapd
 1729 root      16  10  1560 1560  1248 S N     0  7.9  2.4   3:26 smbd
 1938 root      10   0   744  744   596 R       0  3.9  1.1   0:17 top
 2103 root       8   0   404  404   340 R       0  1.3  0.6   0:00 chat
  430 root       5   0   752  752   584 S       0  0.7  1.1   0:33 diald
  511 root       1   0   292  292   240 S       0  0.7  0.4   0:04 update
 2098 admin     10  10  1284 1284   944 S N     0  0.3  2.0   0:00 in.proftpd
    2 root       1   0     0    0     0 SW      0  0.1  0.0   0:01 kflushd
  601 root       0   0   860  860   684 R       0  0.1  1.3   0:02 in.telnetd
 1563 root       0   0  1216 1216   928 S       0  0.1  1.9   0:00 sendmail
    1 root       0   0   384  384   296 S       0  0.0  0.6   0:00 init
    3 root     -12 -12     0    0     0 SW<     0  0.0  0.0   0:00 kswapd
    4 root       0   0     0    0     0 SW      0  0.0  0.0   0:00 md_thread
    5 root       0   0     0    0     0 SW      0  0.0  0.0   0:00 md_thread
    6 root       0   0     0    0     0 SW      0  0.0  0.0   0:00 nfsiod
    7 root       0   0     0    0     0 SW      0  0.0  0.0   0:00 nfsiod
    8 root       0   0     0    0     0 SW      0  0.0  0.0   0:00 nfsiod
imapd seems very busy ... is this "normal"?
Having a closer look at the inboxes I found that a couple were large, but
not unreasonably so. Having a look at the users' home directories I found that
each of the users had created quite a few mail folders, some of which were
quite large.
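
To put numbers on "quite large", a minimal sketch along these lines would list the biggest files under the home directories. It assumes Python 3 is available and that the mail folders are stored as ordinary files somewhere under the directory passed in; the function names and the example path are illustrative only, not anything that ships with the Qube.

```python
import os


def largest_files(root, limit=20):
    """Return up to `limit` (size_in_bytes, path) pairs under root, largest first."""
    sizes = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                sizes.append((os.path.getsize(path), path))
            except OSError:
                continue  # file vanished or is unreadable; skip it
    sizes.sort(reverse=True)
    return sizes[:limit]


def report(root, limit=20):
    """Print the largest files under root, one per line, with sizes in kilobytes."""
    for size, path in largest_files(root, limit):
        print(f"{size / 1024:10.0f}K  {path}")
```

Calling `report("/home")` (the path is an assumption; substitute wherever the users' home directories actually live) would show at a glance which mail folders have grown the most.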
Normally I wouldn't think any of this would cause a problem, but after
having a look at the customer's workstations I found that Outlook 2000 was
configured to check ALL folders for new mail ... this is really dumb as new
mail only ever appears in one's inbox.
Anyway, is it possible that IMAP is causing the hang when too much is asked
of it?
Perhaps I'm barking up the wrong tree, but this Qube is generally stable out
of hours when there are no users, and has been known to hang while the users
are in a meeting or out to lunch, when nothing is happening but Outlook
continues to check for mail.
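
One way to test whether the hangs line up with the mail polling would be to write timestamped snapshots to disk, so the last entry before a hang survives the restart. A minimal sketch, assuming Python 3.7 or later is available on the box and that the installed top accepts `-b -n 1` (batch mode, one iteration); the function names, log path and interval are all made up for illustration.

```python
import os
import subprocess
import time
from datetime import datetime


def append_snapshot(log_path, command):
    """Run `command` and append its output to log_path under a timestamp header."""
    result = subprocess.run(list(command), capture_output=True, text=True, check=False)
    stamp = datetime.now().isoformat(timespec="seconds")
    with open(log_path, "a") as log:
        log.write(f"=== {stamp} ===\n{result.stdout}\n")
        log.flush()
        os.fsync(log.fileno())  # push the write to disk before a possible hang
    return stamp


def log_forever(log_path="/var/log/top-snapshots.log",
                command=("top", "-b", "-n", "1"),
                interval_seconds=30):
    """Take a snapshot every interval_seconds until interrupted (assumed defaults)."""
    while True:
        append_snapshot(log_path, command)
        time.sleep(interval_seconds)
```

Running `log_forever()` in the background and reading the tail of the log after the next hang would show whether imapd was busy each time, or only on this one occasion.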
Perhaps it's Outlook 2000 which is the problem, as all my other customers are
happy with Outlook Express. However I have a problem with the concept of a
mail client bringing down a server. Actually it would be nice to blame
Microsoft, but my customer can't live without Office 2000 and in particular
Outlook 2000.
Since this problem has developed over time I have a feeling that it is
related to the gradual accumulation of email. Hence I have suggested to the
customer that they create a bunch of local folders instead of IMAP folders
and see if stability returns.
Any fresh ideas (or the solution) would be gratefully accepted.
Cheers, Malcolm