
Re: [cobalt-users] Diagnosing network freezes on Raq4?



> > We have a server that has been relatively imperturbable.  It
> > has stopped chatting on the net twice in two days.  All
> > internal stuff seems to continue, as cron continues
> > processing its charges, internally generated email from
> > logcheck is submitted, the active monitor entries appear in
> > the appropriate logs, and mrtg seems to continue logging its
> > various datapoints. We can find no evidence of mischief afoot
> > and there have been no recent upgrades or installs (or even
> > site additions) on this particular box.
> >
> > Any pointers on how to figure out what is making it stop
> > responding on all ports/services?  Also, is there a data
> > corruption risk when rebooting from the front panel or does
> > it shut down daemons and filesystems properly?  Now if I just
> > had a 700 mile pole to push the button with <g>.

You don't have Interland as your provider, by any chance?
A reboot of the box corrects it most of the time. What time of day does the
box stop responding? If you can pinpoint it, you can reboot before it
happens.
I'm asking because we had the exact same problem. It turned out to be
the provider's hubs locking up every morning at 7:30am. For the longest
time we rebooted at 7:00am, and that seemed to keep the server from going
down. Since Interland could never fix it, we pulled our boxes, brought
them home, and now host them ourselves.
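
If it helps with pinpointing the time of day, here is a rough sketch of a probe logger, not something we actually ran. It assumes a Python 3 interpreter is available, either on the box itself (probing an outside host such as the gateway) or on another machine (probing the server). The function names, the log format, and the one-minute interval are all made up for illustration.

```python
import socket
import time
from datetime import datetime


def can_connect(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def outage_starts(samples):
    """Given (datetime, ok) samples in time order, return the timestamps
    at which the link went from reachable to unreachable."""
    starts = []
    previous_ok = True  # assume the link was up before the first sample
    for stamp, ok in samples:
        if previous_ok and not ok:
            starts.append(stamp)
        previous_ok = ok
    return starts


def monitor(host, port, logfile, interval=60):
    """Append one 'timestamp up|down' line per probe; runs until interrupted."""
    while True:
        ok = can_connect(host, port)
        with open(logfile, "a") as log:
            log.write(f"{datetime.now().isoformat()} {'up' if ok else 'down'}\n")
        time.sleep(interval)
```

Something like monitor("192.0.2.1", 80, "netprobe.log") would start the logging (the address and file name are placeholders). If the "down" entries keep starting at the same time every day, that points at something scheduled or upstream rather than at the box.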