[cobalt-users] sendmail bogs down
- Subject: [cobalt-users] sendmail bogs down
- From: Parker Morse <morse@xxxxxxxxxxx>
- Date: Mon May 12 12:19:01 2003
- List-id: Mailing list for users to share thoughts on Sun Cobalt products. <cobalt-users.list.cobalt.com>
We had a(nother) sendmail crash this weekend on our Qube3. I'm trying to
figure out what's causing these so I can prevent another.
Sendmail apparently went down between 11:45 and midnight on Friday night.
I noticed the problem when I checked my work email on Sunday (rare) and
saw that a number of cron jobs had emailed errors: fcheck was
choking on the md5sum of something (which happens regularly, always a
different file), a weekly mysqldump backup of several sites on another
server had failed, and my script which counts DNSBL rejections reported
none(!) for the previous day... not likely.
Also, the load average on the server was up around 6, which is very
unusual.
Telnet to port 25 would establish a connection, but there was no sendmail
banner.
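
The "connects but never greets" symptom can be checked from a script rather than by hand with telnet. The following is a minimal sketch, assuming a Python 3 interpreter is available on some machine that can reach the server; the function name and the 10-second default timeout are illustrative choices, not anything sendmail or the Qube provides.

```python
import socket
from typing import Optional


def read_smtp_banner(host: str, port: int = 25, timeout: float = 10.0) -> Optional[str]:
    """Return the first line the server sends, or None if nothing arrives in time.

    Connection failures (refused, unreachable) are raised to the caller,
    so "connected but silent" can be told apart from "could not connect".
    """
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.settimeout(timeout)
        try:
            data = sock.recv(512)
        except socket.timeout:
            return None  # connected, but no greeting arrived before the timeout
    if not data:
        return None  # server closed the connection without sending anything
    return data.decode("ascii", errors="replace").splitlines()[0]
```

Calling `read_smtp_banner("localhost")` from cron and alerting when it returns None would catch this state without waiting for someone to read their mail.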
ps showed two sendmail processes, but kill [pid] wouldn't shut them down!
When I tried /etc/rc.d/init.d/sendmail stop, it would report
    stopping mail service: sendmail ERROR!ok
and then trying sendmail start would just say
    starting mail service:
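
One possible explanation for processes that ignore kill (an assumption; nothing in the logs confirms it) is that they were in uninterruptible sleep, shown as state D by ps, where signals are not acted on until the pending I/O completes. A rough sketch of how that could be checked on Linux by reading /proc/<pid>/stat follows; the helper names are hypothetical, and it assumes Python 3 and a mounted /proc.

```python
import os
from typing import Dict


def parse_stat_state(stat_line: str) -> str:
    """Extract the one-letter state from a /proc/<pid>/stat line.

    The command name is wrapped in parentheses and may itself contain
    spaces or parentheses, so split after the LAST closing parenthesis.
    """
    after_comm = stat_line.rsplit(")", 1)[1]
    return after_comm.split()[0]


def process_states(name: str, proc_root: str = "/proc") -> Dict[int, str]:
    """Map pid -> state letter for every process whose command name equals `name`."""
    states = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "meminfo"
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                line = fh.read()
        except OSError:
            continue  # process exited while we were scanning
        comm = line[line.index("(") + 1 : line.rindex(")")]
        if comm == name:
            states[int(entry)] = parse_stat_state(line)
    return states
```

If `process_states("sendmail")` reports D for the stuck processes, the thing to chase is whatever I/O they are waiting on rather than sendmail itself.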
In hindsight, I should probably have checked "mailq" at this point to see
if there was a problem there. Instead, I rebooted the box, which cleared
up all problems. It is now running normally. I scoured the logs (maillog
and messages) this morning and didn't find anything interesting - just a
dramatic reduction (to zero) of outside connections around midnight, and a
depressing amount of incoming spam before that.
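
For next time, the queue check that was skipped here can be approximated without running mailq at all. This is only a sketch: it assumes the queue lives in /var/spool/mqueue (the usual sendmail default, but not verified for the Qube3), that each queued message has one control file whose name starts with "qf", and that the script runs as a user allowed to read that directory.

```python
import os


def count_queued_messages(queue_dir: str = "/var/spool/mqueue") -> int:
    """Count queue control files (names starting with "qf"), one per queued message."""
    return sum(1 for name in os.listdir(queue_dir) if name.startswith("qf"))
```

Logging this number every few minutes would show whether the queue was ballooning in the hour before sendmail stopped answering.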
I'd like to figure out what is causing this so I can avoid it in the
future, or at least recover from it more gracefully than by rebooting the
server. I searched the archives for "sendmail shutdown" and "sendmail not
responding" but so far I haven't found anything which matches my situation.
I have found some posts implying that sendmail will shut down at a
certain load average; is it possible that a combination of a hung fcheck
process and a big blob of spam (using both sendmail and spamd) could have
produced a load average high enough (>15) to shut it down? Also, those
messages suggested that Active Monitor would attempt to restart sendmail,
and that apparently wasn't happening.
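
The load-average behaviour mentioned in those posts is governed by sendmail's QueueLA and RefuseLA options, so the actual limits on a given box can be read out of sendmail.cf and compared with the current load. A minimal sketch, assuming Python 3; the function names are hypothetical, the path to sendmail.cf has to be supplied by the caller, and treating "at or above the limit" as the trigger is a simplification rather than sendmail's documented comparison.

```python
import os
import re
from typing import Dict

# Matches active option lines such as "O RefuseLA=12"; commented lines start with "#".
_LA_OPTION = re.compile(r"^O\s+(QueueLA|RefuseLA)\s*=\s*(\d+)", re.MULTILINE)


def parse_la_options(cf_text: str) -> Dict[str, int]:
    """Return explicitly set QueueLA/RefuseLA values from sendmail.cf text.

    Commented-out lines are ignored, so an option missing from the result
    is running at sendmail's built-in default.
    """
    return {name: int(value) for name, value in _LA_OPTION.findall(cf_text)}


def limits_reached(load_1min: float, options: Dict[str, int]) -> Dict[str, bool]:
    """For each configured limit, report whether the given load is at or above it."""
    return {name: load_1min >= limit for name, limit in options.items()}
```

On the server itself this could be used as `limits_reached(os.getloadavg()[0], parse_la_options(open(cf_path).read()))`, where `cf_path` is wherever sendmail.cf lives on that system.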
Any ideas?
Thanks,
pjm