Re: [cobalt-users] help server down (strange)
- Subject: Re: [cobalt-users] help server down (strange)
- From: Greg Hewitt-Long <cobaltusers@xxxxxxxxxxxxxxxxxxx>
- Date: Sun Apr 27 10:12:01 2003
- List-id: Mailing list for users to share thoughts on Sun Cobalt products. <cobalt-users.list.cobalt.com>
>Hi all
>[snip]
Do you have 256MB of RAM, or 128MB of RAM and 128MB of swap? How many sites?
We run the RAQs with a minimum of 512MB of RAM now. I'm about to get some 512MB sticks if I can and increase to 768MB initially, with a plan to go to 1024MB in a month or so (Gerald - got any 512MB sticks?).
>I tried to enter the admin GUI; that did not work
>
>I can ftp, mail and telnet.
>
>So I tried to reboot
>
>By telnet I did su to root and ran /sbin/shutdown -r now
Install SSH from http://www.pkgmaster.com, then disable telnet as soon as possible. Telnet is a serious security hole; get rid of it.
>
>The server did not show me whether it did a reboot; I got the prompt back directly
On our RAQ4 this command works:
/sbin/shutdown -r now
You will get your prompt back, but you will also see a broadcast message like this:
Broadcast message from root (pts/2) Sun Apr 27 10:58:30 2003...
The system is going down for reboot NOW !!
Within 1 minute your telnet session should terminate or go dead; that is the reboot dropping the connection. If it doesn't, or you don't get the broadcast message, you have a problem. Perhaps you've been hacked.
Once you reboot, open two telnet/ssh sessions and run top in one of them.
Look for processes going nuts and using lots of RAM.
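If you'd rather have a one-shot listing than watch top, something along these lines could do it. This is only a sketch: top_mem is a made-up helper name, and the usage line assumes a procps-style ps that accepts -eo, which I have not checked on every RAQ.

```shell
#!/bin/sh
# top_mem: hypothetical helper (not a standard command).
# Reads "PID RSS COMMAND" lines on stdin and prints the N lines with the
# largest RSS (resident memory, in KB), biggest first. N defaults to 10.
top_mem() {
    n="${1:-10}"
    sort -k2,2nr | head -n "$n"
}

# Typical use (assumes ps supports -eo; sed 1d drops the header line):
#   ps -eo pid,rss,comm | sed 1d | top_mem 5
```

Run it in the second session a few times; a process that keeps climbing the list is the one to look at.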
My guess is that you have a rogue bot, or someone on the other end of a browser with IUS (Idiot User Syndrome). It's common! The user of your site isn't getting a response fast enough, so they hit the back button and then forward, or click the link or button again. They'll often repeat this a hundred times! I've seen it!
IUS browsers labor under the mistaken belief that requesting the same thing again will make it arrive quicker. Instead they make the problem worse, and keep doing so for as long as the requests hit the same machine. (This may be where the confusion comes from: large corporate sites using load balancers with a round-robin type algorithm *can* be made to respond this way, because the repeat request moves on to the next httpd server in the list.) A user who has learned that repeating requests gets their data quickly can crash a lesser server with requests, as they have NO CLUE what is on the server end.
Anyway, whether it's a bot or IUS, the problem occurs when your RAQ4 uses up its RAM and then begins to use swap space (disk used as RAM), which is MUCH slower than real RAM. The problem gets worse and worse.
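To see whether the box has started swapping, you can compare total and free swap. A minimal sketch, assuming /proc/meminfo on your kernel has SwapTotal: and SwapFree: lines in kB (swap_used_kb is a made-up helper name):

```shell
# swap_used_kb: hypothetical helper (not a standard command).
# Reads /proc/meminfo-style text on stdin and prints swap in use, in KB,
# computed as SwapTotal minus SwapFree.
swap_used_kb() {
    awk '/^SwapTotal:/ {t = $2} /^SwapFree:/ {f = $2} END {print t - f}'
}

# Typical use:
#   swap_used_kb < /proc/meminfo
```

A number that keeps growing while the site gets slower fits the pattern described above.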
We've had both IUS and bots cause this on servers. Ipchaining out the bot/user is the first task. Then we ran top and cut and pasted the output into a spreadsheet to grab the process IDs; in our case, we were able to terminate the processes gracefully using kill -1.
If you can't do that, simply shut down the httpd processes for 10-15 minutes:
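The spreadsheet step can probably be skipped with a little awk. This is a rough sketch, not what we actually ran: big_httpd_pids is a made-up helper, the 20000 KB limit and the 192.0.2.10 address are placeholders, and the input is assumed to be "PID RSS COMMAND" lines as printed by a procps-style ps.

```shell
# big_httpd_pids: hypothetical helper (not a standard command).
# Reads "PID RSS COMMAND" lines on stdin and prints the PID of every
# httpd process whose RSS (in KB) is above the limit given as $1.
# A header line is skipped automatically because its third column
# is not "httpd".
big_httpd_pids() {
    limit_kb="$1"
    awk -v limit="$limit_kb" '$3 == "httpd" && $2 + 0 > limit + 0 {print $1}'
}

# Block the offender first (placeholder address, run as root):
#   ipchains -A input -s 192.0.2.10 -j DENY
#
# Then send signal 1 (HUP) to the oversized httpd processes:
#   for pid in $(ps -eo pid,rss,comm | big_httpd_pids 20000); do
#       kill -1 "$pid"
#   done
```

Print the PID list on its own and check it before you put the kill in the loop.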
/etc/rc.d/init.d/httpd stop ; sleep 600 ; /etc/rc.d/init.d/httpd start;
To get at the underlying problem, you may need to reduce the RAM that each process uses, add more RAM, or both. An interim solution is to block the bot/IP that is hurting your system.
hth
Greg
>
>Can anyone help me?
>
>How can I do a new reboot? And a good one?
>
>Or how can I restart the DNS server by telnet?
>
>Please some help!
>
>Maurice
>
>_____________________________________
>cobalt-users mailing list
>cobalt-users@xxxxxxxxxxxxxxx
>To subscribe/unsubscribe, or to SEARCH THE ARCHIVES, go to:
>http://list.cobalt.com/mailman/listinfo/cobalt-users
--
http://www.webyourbusiness.com/
Providers of E-Commerce Software &
Web Design Consultancy and Services.
PH: (970) 266-0195 FAX: (970) 266-0158