[cobalt-users] Raq3 - The mysterious dying of services - now after backup ftp
- Subject: [cobalt-users] Raq3 - The mysterious dying of services - now after backup ftp
- From: Chae <chae@xxxxxxxxxxxx>
- Date: Fri Oct 4 04:44:57 2002
- List-id: Mailing list for users to share thoughts on Sun Cobalt products. <cobalt-users.list.cobalt.com>
Hi Yah,
The mysterious dying of services on one of our RaQ3s happened again
tonight, 2 days after the last event, but this time it happened while I was
working in front of my screen.
Raqbackup had already finished its run, and before knocking off I checked
the backup set on the backup server. Sure enough, it hadn't FTP'd everything
across again, so I deleted the set, ran raqbackup.sh from the command line
and started to watch top. The backup script completed the cmuexport and
gzip, and then I could see root running ftp. As soon as ftp had finished I
jumped onto the backup server and checked the backup set: it was complete.
By the time I looked at the shell screen again the server had lost all
services...no ssh, ftp, httpd - nothing.
This now seems to tie in with what happened the last time and in fact
nearly every other time prior to that...
I am at a total loss as to why it would be doing this...
I've spoken about this in the past with my colo, who suggested it may be
one of two things (their reply is quoted further down)...
What I see in my hourly logs at the end of a backup run is something
similar to this:
> Oct 04 10:38:20 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3401 SYN ******S*
> Oct 04 10:38:20 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3402 SYN ******S*
> Oct 04 10:38:22 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3404 SYN ******S*
> Oct 04 10:38:23 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3405 SYN ******S*
> Oct 04 10:38:23 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3406 SYN ******S*
> Oct 04 10:38:24 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3407 SYN ******S*
> Oct 04 10:38:24 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3408 SYN ******S*
> Oct 04 10:38:27 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3409 SYN ******S*
> Oct 04 10:38:28 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3410 SYN ******S*
> Oct 04 10:38:28 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3411 SYN ******S*
> Oct 04 10:38:29 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3412 SYN ******S*
> Oct 04 10:38:34 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3413 SYN ******S*
> Oct 04 10:38:36 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3414 SYN ******S*
> Oct 04 10:39:54 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3427 SYN ******S*
> Oct 04 10:39:55 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3428 SYN ******S*
> Oct 04 10:39:57 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3429 SYN ******S*
> Oct 04 10:39:57 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3430 SYN ******S*
> Oct 04 10:39:57 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3431 SYN ******S*
> Oct 04 10:39:58 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3432 SYN ******S*
> Oct 04 10:39:58 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3433 SYN ******S*
> Oct 04 10:39:59 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3434 SYN ******S*
> Oct 04 10:39:59 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3435 SYN ******S*
> Oct 04 10:39:59 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3436 SYN ******S*
> Oct 04 10:40:01 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3437 SYN ******S*
> Oct 04 10:40:06 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3438 SYN ******S*
> Oct 04 10:40:29 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3446 SYN ******S*
> Oct 04 10:40:30 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3447 SYN ******S*
> Oct 04 10:40:30 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3448 SYN ******S*
> Oct 04 10:40:30 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3449 SYN ******S*
> Oct 04 10:40:30 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3450 SYN ******S*
> Oct 04 10:40:31 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3451 SYN ******S*
> Oct 04 10:40:32 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3452 SYN ******S*
> Oct 04 10:40:47 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3453 SYN ******S*
When there has been a successful FTP of the backup set I'll usually see
several hundred of these lines. I'm assuming it's one for each file
transferred over to the backup server, the first IP being the backup server
and the second being the RaQ3. I know straight away when I haven't had a
successful transfer, as there are usually only around 100 lines or
thereabouts. It usually craps itself at port 2000, then tries one more time
at that port, then no more records. But some days it doesn't use the same
ports as the day before???
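To take the guesswork out of counting those lines by eye, here is a rough
sketch in Python that tallies them. It assumes the log format is exactly as
pasted above, and that a source port of 20 means the FTP data channel of an
active-mode transfer (the backup server connecting back to the RaQ). The
function name summarize and the regular expression are my own, not part of
raqbackup or any Cobalt tool.

```python
import re

# Matches lines such as:
# > Oct 04 10:38:20 xxx.xxx.64.4:20 -> xxx.xxx.64.121:3401 SYN ******S*
LINE_RE = re.compile(
    r"^>?\s*(\w{3} \d{2} \d{2}:\d{2}:\d{2}) "
    r"(\S+):(\d+) -> (\S+):(\d+) SYN"
)


def summarize(lines):
    """Count SYN entries coming from source port 20 and note the port range."""
    ports = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        # Source port 20 is assumed to be the FTP data connection.
        if match and match.group(3) == "20":
            ports.append(int(match.group(5)))
    if not ports:
        return {"count": 0, "first_port": None, "last_port": None}
    return {
        "count": len(ports),
        "first_port": ports[0],
        "last_port": ports[-1],
    }
```

Feeding it the lines from one backup run gives the number of data
connections and the first and last destination port, which makes it easier
to compare a good run (several hundred lines) with a bad one (around 100).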
My ISP stated...
"...FTP consists of two layers a control and a data layer. You are seeing
the data layer of your own FTP. It also appears since the server is "step
through" UDP ports that the return side is not successful. Are your ftp
events working? If not I suggest using passive FTP or changing firewall rules."
Now I know that the FTP transfers do work some of the time, well, about 80%
of the time - so can anyone shed some light on how I can make the RaQ FTP to
the backup server in passive mode (if possible)? Either how to make the
raqbackup.sh script tell the server to ftp in passive mode, or how to make
my firewall allow this sort of transfer. Again I'm not sure what to think,
because surely if it was a firewall issue it would be consistent and never
allow a successful FTP transfer of the backup set, and if it was a passive
mode issue surely it wouldn't transfer the files over 80% of the time. The
fact that the server suddenly locks everything up and doesn't allow anyone
or anything in or out makes me think maybe firewall, with the reboot
flushing ipchains and allowing normal functionality again.
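One way to test the passive-mode theory without touching raqbackup.sh at
all would be to push the finished backup set by hand with passive mode
forced on. Below is a minimal sketch using Python's standard ftplib,
assuming Python is available on the box. The function name upload_passive
and the ftp_factory parameter are made up for illustration (the factory is
only there so the function can be tried against a stand-in object), and the
host, login and file names are whatever your own setup uses. This is not
how raqbackup.sh does its transfer.

```python
from ftplib import FTP


def upload_passive(host, user, password, local_path, remote_name,
                   ftp_factory=FTP):
    """Upload one file with passive mode explicitly enabled."""
    ftp = ftp_factory()
    try:
        ftp.connect(host, 21, timeout=60)
        ftp.login(user, password)
        # Passive mode: the client opens the data connection, so the
        # server never connects back from port 20.
        ftp.set_pasv(True)
        with open(local_path, "rb") as handle:
            ftp.storbinary("STOR " + remote_name, handle)
        ftp.quit()
    finally:
        # Always release the socket, even if a step above failed.
        ftp.close()
```

If transfers done this way never show the source-port-20 SYN lines in the
hourly log and the box stays up, that would point towards the active-mode
data connections rather than the backup itself.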
Like I mentioned before, this is doing my wee head in. I can't guarantee a
successful backup any more, and I keep my fingers crossed after each backup
that the server doesn't die on me. The funny thing is if I use the GUI's
bog-standard backup it FTPs across first time every time and there is never
a lockout...
I look forward to any solutions or workarounds from those of you more
educated and experienced than I am :)
Regards
Chae