
RE: [cobalt-users] Rebooting the RaQ



> Hello, as per page 59 of the Cobalt RaQ user manual on rebooting a RaQ:
> We have noticed that a RaQ that is either under heavy load or
> unresponsive to telnet, HTTP requests, etc. cannot be rebooted from the
> LCD.  Following Cobalt's instructions in the manual does not work under
> these circumstances, and the only remedy appears to be a hard reboot.
> Obviously this is a last resort, but we do not typically like to do hard
> reboots, as we have destroyed RaQs in the past by doing this.
>
> Has anyone else ever noticed behavior like this?  Is there a reason the
> LCD reboot method does not work at certain times?

The LCD panel is driven by forking a process and calling a number of Perl
scripts.  If something has gone wrong to the point where no more processes
can be forked, the LCD will have trouble responding.

I suppose that if you really wanted to, you could write a "dead man's
switch" script that runs a single process in a loop.  In that loop it tries
to fork simple processes (like a "sleep 1") every once in a while, and if
that fails some number of times in a row, it reboots the system.  (This is a
similar concept to a watchdog reset, but most watchdog resets are hard
resets, which you apparently don't want to do.)
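
For illustration only, here is a rough, untested-on-a-RaQ sketch of such a
dead man's switch in Python.  Everything named here is an assumption made
for the example, not something Cobalt ships: the function names, the 30
second interval, the threshold of 5 failures, and the /sbin/shutdown path.
The reboot step uses exec rather than fork, so it replaces the watcher
process itself instead of needing yet another new process.

```python
import os
import subprocess
import time


def can_fork(timeout=10.0):
    """Return True if a trivial child process ("sleep 1") can be spawned."""
    try:
        subprocess.run(["sleep", "1"], check=True, timeout=timeout)
        return True
    except (OSError, subprocess.SubprocessError):
        # Covers fork failure, a missing binary, a non-zero exit, or a timeout.
        return False


def request_reboot():
    """Ask for a soft reboot by exec'ing shutdown in place of this process.

    The path is an assumption; adjust it for the system.  Needs root.
    """
    os.execv("/sbin/shutdown", ["shutdown", "-r", "now"])


def watch(probe=can_fork, reboot=request_reboot, max_failures=5,
          interval=30.0, max_checks=None, sleep=time.sleep):
    """Probe in a loop; call reboot() after max_failures failures in a row.

    Returns True if a reboot was requested, or False if max_checks probes
    ran without reaching the threshold.  With max_checks=None it loops
    until a reboot is requested.
    """
    failures = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        if probe():
            failures = 0  # any success resets the streak
        else:
            failures += 1
            if failures >= max_failures:
                reboot()
                return True
        sleep(interval)
    return False
```

To use it, you would start it once as root (for example from an init
script) and have it call watch() with no arguments.  The probe and reboot
actions are passed in as parameters so the loop can be exercised with
harmless stand-ins before trusting it on a production box.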

However, if a system is that hosed, I'm not sure the reboot command would be
able to execute, much less sync the disks and flush your buffers.  I suspect
a hard reboot may be your only option.

- Lyle