
Re: [cobalt-users] web.cache VERRY big



Brett,
> While I know this is a bit difficult to do when servers are
> colocated,

I'm not sure how colocation figures into the picture, but I do not colocate.
I own the RaQs and the other servers. I also own the building they are in.
Well, technically, the bank owns the building, but hopefully I'll own it in
30 years.

> and the time the server crashes also enters into the
> equation in a very big way but might I suggest you simply log onto
> the shell and start watching top a few moments before the server
> crashes?

Sorry, I didn't include that part. 15 minutes before the server crashes, I
log in as admin, type su, and the server responds with 'segmentation fault'.
Could it be memory? I can't put in more than 512MB.

> I've fixed a number of similar situations here simply by
> swapping times that cron runs things on a server. The trick is in
> finding out what is bogging the server down.

Sorry, I didn't include that part. I've tried running logrotate at various
times during the day, by itself, with the same results. I've tried running
logrotate by itself with half the files removed from /etc/logrotate.d. It is
a busy server so I also tried deleting access in the middle of the day to
reduce its size.

> If you are sure it's
> the result of the log rotation

I wouldn't classify myself as an expert in log rotation, so I can't say with
absolute certainty it has something to do with log rotation. What I can say
is that every time I type /etc/cron.daily/logrotate, the server crashes
after about 3.5 hours. So, my limited experience makes me believe it has
something to do with log rotation. To be honest with you, I have never gone
through every file in /etc/logrotate.d to make sure Cobalt didn't throw
something in there that had nothing to do with log rotation.
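
One way to make that check without reading each file by hand is sketched
below. It is only a sketch: it assumes Python is available on the server and
that the files follow the usual logrotate syntax, where commands sit between
a prerotate, postrotate, firstaction or lastaction line and a closing
endscript line. The function names extract_scripts and audit are made up for
this example.

```python
from pathlib import Path

# logrotate directives that open a shell-script block closed by "endscript".
SCRIPT_DIRECTIVES = {"prerotate", "postrotate", "firstaction", "lastaction"}


def extract_scripts(config_text):
    """Return (directive, [command lines]) pairs found in logrotate config text."""
    scripts = []
    current = None  # the block being collected, or None when outside a block
    for raw in config_text.splitlines():
        line = raw.strip()
        if current is None:
            if line in SCRIPT_DIRECTIVES:
                current = (line, [])
        elif line == "endscript":
            scripts.append(current)
            current = None
        elif line:
            current[1].append(line)
    return scripts


def audit(directory="/etc/logrotate.d"):
    """Print every command that logrotate would run for files in `directory`."""
    root = Path(directory)
    if not root.is_dir():
        print("no such directory:", directory)
        return
    for path in sorted(root.iterdir()):
        if not path.is_file():
            continue
        try:
            text = path.read_text(errors="replace")
        except OSError as err:
            print("{}: unreadable ({})".format(path, err))
            continue
        for directive, commands in extract_scripts(text):
            print("{}: {}".format(path, directive))
            for command in commands:
                print("    " + command)
```

Calling audit() lists, per file, the commands run before or after rotation,
so anything unrelated to log rotation should stand out. It only reads the
files and changes nothing.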

> then simply start moving other cron
> jobs to less heavily loaded times. Our experience here is that a few
> hours of just watching top very often provides all the clues one
> needs to get things back under control.

Sorry, I must have left the most important part out. I don't have the luxury
of a 'few hours' to sit around and watch top play, particularly on live
servers that take 20 minutes to reboot after a crash. But when I did
watch top play, the server crashed while running ParseReport web, if this is
any help. Certainly it's of no help to me, because I'm not rooting through
ParseReport to find out why. The stats available through the GUI are updated, as
near as I can tell, except for site99, which is a site that receives very
little traffic. Of course, I didn't go through all 190 sites to make sure the
stats were updated on all of them.
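
Since nobody can sit in front of top for 3.5 hours, the watching could be
left to a script that writes a line to disk every so often. The sketch below
is one possible approach, not a tested fix: it assumes Python is available
and that /proc/meminfo and /proc/loadavg exist on the server. The names
parse_meminfo, snapshot_line and record, and the log path in the usage note,
are made up for this example. It records only the system-wide load and free
memory, not which process is responsible.

```python
import os
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a {field: number} dict."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Keep only "Name:  12345 kB" style lines; skip headers and blanks.
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def snapshot_line(meminfo_text, loadavg_text, now):
    """Build one log line from the raw contents of the two /proc files."""
    mem = parse_meminfo(meminfo_text)
    load = "/".join(loadavg_text.split()[:3])  # 1, 5 and 15 minute averages
    stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(now))
    # Assumption: MemFree and SwapFree are reported in kB on this kernel.
    return "{} load={} MemFree={}kB SwapFree={}kB".format(
        stamp, load, mem.get("MemFree", "?"), mem.get("SwapFree", "?"))


def read_file(path):
    with open(path) as handle:
        return handle.read()


def record(log_path, interval=30, count=None):
    """Append a snapshot every `interval` seconds; run forever if count is None."""
    written = 0
    with open(log_path, "a") as log:
        while count is None or written < count:
            line = snapshot_line(read_file("/proc/meminfo"),
                                 read_file("/proc/loadavg"), time.time())
            log.write(line + "\n")
            # Force each line to disk so the last entries survive a crash.
            log.flush()
            os.fsync(log.fileno())
            written += 1
            if count is None or written < count:
                time.sleep(interval)
```

Started with something like record("/var/tmp/crashwatch.log") just before
running /etc/cron.daily/logrotate, the last lines in the file after the
reboot would show whether load or free memory was heading somewhere bad in
the minutes before the crash.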

>
> Peace be with you,
>
Peace be with you.

Rich