
Re: [cobalt-users] Overheating




> Are a lot of cron jobs running? When the processor and disks are working
> they generate more heat; maybe it is just on the threshold, triggering the
> alarm.

The only jobs running at that time are 'swatch -c' and ntpdate.


Beats me, then. I'm guessing you've checked the physical location for anything obvious (air conditioning turned off for a couple of hours during the night, server sitting on top of an old mainframe that gets really, really hot at that time of the morning)? Just as a fault-finding measure, maybe you could adapt the following script to check the temperature every few minutes around that time. Maybe the GUI is getting it wrong. In your case I'd put it in crontab by hand to run every 5 minutes or so, so you can see a trend.
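To see a trend rather than a single alarm, something like the following could log each reading with a timestamp. This is only a rough sketch: the names read_temp and log_temp, the script path /root/templog.sh and the log file /tmp/cputemp.log are made up for illustration, and it assumes /proc/cpuinfo has a "temperature" line in the form the script further down expects.

```shell
#!/bin/sh
# Sketch: append a timestamped CPU temperature reading to a log file.
# CPUINFO and LOGFILE are assumed names; override them in the environment if needed.
CPUINFO="${CPUINFO:-/proc/cpuinfo}"
LOGFILE="${LOGFILE:-/tmp/cputemp.log}"

# Print the temperature field from a cpuinfo-style file, using the same
# grep/cut extraction as the alert script.
read_temp() {
    grep temperature "$1" 2>/dev/null | cut -d ":" -f 2 | cut -d " " -f 2
}

# Append "date time temperature" to the log; writes "unknown" if nothing was found.
log_temp() {
    temp=`read_temp "$CPUINFO"`
    echo "`date '+%Y-%m-%d %H:%M:%S'` ${temp:-unknown}" >> "$LOGFILE"
}

log_temp

# Example crontab line (added with "crontab -e") to run this every 5 minutes,
# assuming the script is saved as /root/templog.sh and your cron accepts */5:
# */5 * * * * /root/templog.sh
```

After a night of running, the log should show whether the temperature really climbs at that time or whether only the GUI thinks so.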

HTH
Steve


#!/bin/sh

## Check the CPU temperature every 15 minutes, and if it's over the threshold
## value, email the admin

###   andy@xxxxxxxxxxxxxxxxxxxxx  04/04/2002
###   http://www.raqpak.com/
###   http://ineedlinux.info/

## Edit this file so that you can set your own maximum temperature and, if
## needed, a custom email address.
## Save the file into /root/cpuinfo.sh and then type the following from
## your ssh/telnet session:
## ln -s /root/cpuinfo.sh /etc/cron.quarter-hourly/cpuinfo.sh
## Which will make the script run every 15 minutes.



## Set the variable SAFETEMP to the maximum temp you want before an alarm
## (in degrees C)
export SAFETEMP=50

## Set who should be emailed when an alarm is generated - Default is usually
## correct
export ALERTME="admin"



####### Don't change things below here, they work fine as they are :) #######

export CURRENTEMP=`/bin/cat /proc/cpuinfo |/bin/grep temperature |/usr/bin/cut -d ":" -f 2 |/usr/bin/cut -d " " -f 2`

## Quote both values so an empty reading does not make expr fail
if [ `expr "$CURRENTEMP" \> "$SAFETEMP"` = 1 ] ; then
    ## The recipient goes in the To: header, which sendmail -t reads from the message
    /usr/bin/printf "To: $ALERTME\nSubject: ** CPU temperature warning **\n\nCPU Temperature on $HOSTNAME (our server!) has exceeded $SAFETEMP oC\n\nTell Steve NOW!\n\nOpen the outer cupboard doors and the inner rack clear plastic door to help the server cool down\n\nIf you know what you are looking at, check the server fan is still spinning\n\nCpu Temperature: $CURRENTEMP oC\n\n\nTemperature check by Andy (andy@xxxxxxxxxxxxxxxxxxxxx)\nhttp://ineedlinux.info/ \nhttp://www.raqpak.com/" | /usr/sbin/sendmail -t
fi
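
One thing to watch with the expr comparison: expr only compares numerically when both values are whole numbers, and otherwise falls back to a string comparison. If the reading ever has a decimal part (an assumption, it may not on your box), a value like 9.5 could compare as greater than 50. A possible workaround, with is_over as a made-up helper name, is to drop the fractional part first:

```shell
#!/bin/sh
# is_over TEMP LIMIT: succeed (exit status 0) if TEMP is above LIMIT.
# Hypothetical helper: strips anything from the first "." onwards so that
# expr sees two integers and compares them numerically.
is_over() {
    whole=${1%%.*}
    # An empty reading is treated as "not over" rather than as an error.
    [ -n "$whole" ] && [ `expr "$whole" \> "$2"` = 1 ]
}

if is_over "47.5" 50; then
    echo "too hot"
else
    echo "ok"
fi
```

This truncates rather than rounds, so 50.9 against a limit of 50 does not trigger; lower SAFETEMP by one if that matters.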