[cobalt-users] Reoccuring Server Outage/Failure
- Subject: [cobalt-users] Reoccuring Server Outage/Failure
- From: Charlie Clemmer <cclemmer@xxxxxxxxxxxxxxxxxx>
- Date: Tue Jan 14 21:49:01 2003
- List-id: Mailing list for users to share thoughts on Sun Cobalt products. <cobalt-users.list.cobalt.com>
I'm trying to troubleshoot a recurring outage, and it doesn't help that the
server is far away (hosted by CobaltRacks), so I don't have access to the
console port while it's down.
Both yesterday and again today, my server (a lightly loaded RaQ3 hosting
about 6-10 domains) dropped off the net for 2-6 hours at a time. Both days
the outage started around 4:30pm. Once the server comes back, uptime
reports that it has not been rebooted, and the logs back that up. From the
maillog and syslog, it appears that during the outage the only traffic I'm
seeing is the active monitor traffic .. I'm getting the normal polls every
15 minutes, but not a thing is coming in from the local ethernet.
Thinking the worst, I've rerun chkrootkit after each failure, in addition to
its normally scheduled runs, and nothing has turned up. I've also got
PortSentry, Logcheck, and ipchains all running on this box, and I've seen
no trips on the security front. Without physical console access during the
outage, what can I go back and check afterwards to get a clue as to what's
going on? As a reference point, the outage is only affecting this one
server; my other servers at CobaltRacks are running fine, so it doesn't
appear to be network related, unless it's the specific ethernet port that
I'm connected to.
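Since I can't watch the console, one thing I may try is leaving a small
logger running from cron so there is something to read after the next
outage. Below is only a rough sketch of that idea, not something I've
deployed: it assumes the interface is eth0, that /proc/net/dev has the usual
16 counter columns, and that a Python interpreter is available on the box.
The names IFACE, LOG_PATH, parse_net_dev, format_entry and log_once are my
own placeholders.

```python
import time

# Assumed placeholders: adjust the interface name and log location.
IFACE = "eth0"
LOG_PATH = "/var/log/netdev-watch.log"


def parse_net_dev(text, iface=IFACE):
    """Pull a few counters for one interface out of /proc/net/dev text.

    Returns a dict of counters, or None if the interface is not listed
    or the line does not have the expected 16 numeric columns.
    """
    for line in text.splitlines():
        if ":" not in line:
            continue  # header lines have no "name:" prefix
        name, data = line.split(":", 1)
        if name.strip() != iface:
            continue
        fields = data.split()
        if len(fields) < 16:
            return None
        return {
            "rx_packets": int(fields[1]),
            "rx_errs": int(fields[2]),
            "rx_drop": int(fields[3]),
            "tx_packets": int(fields[9]),
            "tx_errs": int(fields[10]),
            "tx_carrier": int(fields[14]),
        }
    return None


def format_entry(counters, when=None):
    """Build one timestamped log line from the counters dict."""
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(when))
    if counters is None:
        return "%s interface not found" % stamp
    parts = ["%s=%d" % (key, counters[key]) for key in sorted(counters)]
    return "%s %s" % (stamp, " ".join(parts))


def log_once(log_path=LOG_PATH, iface=IFACE):
    """Read the live counters and append one line to the log file."""
    with open("/proc/net/dev") as fh:
        counters = parse_net_dev(fh.read(), iface)
    with open(log_path, "a") as out:
        out.write(format_entry(counters) + "\n")
```

Calling log_once() every minute from cron would show whether rx_packets
stops climbing at 4:30pm while tx_packets keeps going, or whether the
error and carrier counters jump, which might help separate a bad port or
cable from something on the box itself.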
Any ideas or pointers?
Charlie