We repurposed an older desktop computer as a Minecraft server. Because the hardware is older and may run under heavy load, we want to keep an eye on the system temperatures to make sure it does not overheat. A computer that runs too hot can suffer permanent hardware damage, leading to loss of function and data. Computers also run slower under high heat, and we would like to avoid that as well.
Solution

We can solve this problem by periodically checking the system temperatures to ensure they are at acceptable levels. We want to check often enough to get a good idea of how warm the system gets at different times of day, but we don't want to check it manually all the time. We can create a Cron job that runs a script every few minutes to check the temperatures automatically and alert us if they are getting out of hand. First, let's set up a Cron job to run the script every five minutes.
/etc/crontab

*/5 * * * * root /home/ben/tempCheck.sh
That was pretty simple. Let's take a quick look at the script it will be running. Please note, this script is specifically written for my test server. The sensors command will likely have different output for your specific system and the number of hard drives in your system may be different.
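Because the sensors layout varies between machines, it can help to test the extraction steps against a captured line before adapting the script that follows. The sketch below wraps the same awk and sed steps the script uses in a hypothetical helper, extract_temp, and runs it on a sample line. The sample line is only an assumption about what sensors might print, not output from my test server; substitute a line from your own system and adjust the field number to match.

```shell
#!/bin/bash
# Hypothetical helper: pull the numeric temperature out of one field of a line.
# $1 = awk field number that holds the temperature; the line is read from stdin.
extract_temp() {
    awk -v field="$1" '{print $field}' | sed 's/[^0-9.]//g'
}

# Assumed sample line; replace it with a real line from your own `sensors` output.
sample_line='Core 0:       +45.0°C  (high = +80.0°C, crit = +100.0°C)'

# Same grep/awk/sed chain as the script, fed from the sample instead of `sensors`.
core0=$(printf '%s\n' "$sample_line" | grep -e 'Core 0' | extract_temp 3)
echo "Core 0: $core0"
```

If the value printed is empty or is not a plain number, the label or field number does not match your sensors output and needs adjusting before the script can compare it against the limits.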
/home/ben/tempCheck.sh

#!/bin/bash

# Temperature limits in °C
cpu_min=5
cpu_max=60
vid_min=5
vid_max=80
hdd_min=5
hdd_max=60

# Read the current temperatures, keeping only digits and the decimal point
cpu0=$(sensors | grep -e 'Core 0' | awk '{print $3}' | sed 's/[^0-9\.]//g')
cpu1=$(sensors | grep -e 'Core 1' | awk '{print $3}' | sed 's/[^0-9\.]//g')
vid0=$(sensors | grep -e 'temp1' | awk '{print $2}' | sed 's/[^0-9\.]//g')
hdd0=$(hddtemp /dev/sda | awk '{print $4}' | sed 's/[^0-9\.]//g')
hdd1=$(hddtemp /dev/sdb | awk '{print $4}' | sed 's/[^0-9\.]//g')

log_file="/var/log/temps.log"

# Create an empty array to store warning messages
declare -a messages

# Print the temperatures to a log file
printf "$(date +'%m/%d/%Y %r') [INFO]: CPU 0: %.1f°C\tCPU 1: %.1f°C\tHDD 0: %.1f°C\tHDD 1: %.1f°C\tVID 0: %.1f°C\n" "$cpu0" "$cpu1" "$hdd0" "$hdd1" "$vid0" >> "$log_file"

# Check if the CPU is overheating, if so add a message
if [ "$(echo "$cpu0 > $cpu_max" | bc)" -eq 1 ] || [ "$(echo "$cpu1 > $cpu_max" | bc)" -eq 1 ]
then
    messages+=("$(printf "Bad CPU temp\tCPU 0: %.1f°C\tCPU 1: %.1f°C\tCPU Max: %.1f°C" "$cpu0" "$cpu1" "$cpu_max")")
fi

# Check if the Hard Drives are overheating, if so add a message
if [ "$(echo "$hdd0 > $hdd_max" | bc)" -eq 1 ] || [ "$(echo "$hdd1 > $hdd_max" | bc)" -eq 1 ]
then
    messages+=("$(printf "Bad HDD temp\tHDD 0: %.1f°C\tHDD 1: %.1f°C\tHDD Max: %.1f°C" "$hdd0" "$hdd1" "$hdd_max")")
fi

# Check if the Video Card is overheating, if so add a message
if [ "$(echo "$vid0 > $vid_max" | bc)" -eq 1 ]
then
    messages+=("$(printf "Bad VID temp\tVID 0: %.1f°C\t\t\tVID Max: %.1f°C" "$vid0" "$vid_max")")
fi

# If any of the temperatures exceeded their limits, email and add the warning to the log file
if [ ${#messages[@]} -gt 0 ]
then
    # Email me about the situation
    /usr/bin/mail -s "Server Overheating on $(date +'%m/%d/%Y %r')" "<Your Email Here>" << END_MAIL
${messages[@]}
END_MAIL

    # Add each warning to the log file
    for message in "${messages[@]}"
    do
        printf "$(date +'%m/%d/%Y %r') [WARNING]: $message\n" >> "$log_file"
    done
fi

Explanation
We know from the Using Cron page that the date and time elements tell Cron to run this job at every minute that is a multiple of five (in other words, every five minutes). The next element, the user field, tells Cron to run the command as root. This is required because the hddtemp utility in Linux requires root privileges. Finally, the last element is the command we'd like to run.
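To make the field layout concrete, this small sketch splits the same /etc/crontab entry into its parts using the shell's read builtin. The variable names are illustrative labels chosen for this example, not names that Cron itself defines.

```shell
#!/bin/bash
# The entry from /etc/crontab, single-quoted so the shell leaves the asterisks alone.
entry='*/5 * * * * root /home/ben/tempCheck.sh'

# The first five fields are the schedule, the sixth is the user,
# and everything that remains is the command.
read -r minute hour day_of_month month day_of_week user command <<EOF
$entry
EOF

echo "minute:       $minute"
echo "hour:         $hour"
echo "day of month: $day_of_month"
echo "month:        $month"
echo "day of week:  $day_of_week"
echo "user:         $user"
echo "command:      $command"
```

The user field appears here because the entry lives in the system-wide /etc/crontab. A per-user crontab edited with crontab -e has no user field, so the command would follow directly after the five schedule fields.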
Each time the job runs, the bash script checks the temperature of the different sensors in our machine. If a sensor reads higher than our maximum limit, an email is sent to warn us of the issue. Regardless of the limits, the sensor temperatures are also recorded every time the script runs, so we can see how warm our system is throughout any given day. Thanks to Cron, we don't have to sit around gathering sensor output all day. We can look at the log file after a day of running and see if the system requires additional cooling. Thanks again, Cron!
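Reviewing a full day of log entries by eye gets tedious, so here is a rough sketch of one way to summarize them. The helper name peak_temps is hypothetical, and the sample log lines contain made-up readings that only imitate the format the script writes; point the function at /var/log/temps.log to use real data.

```shell
#!/bin/bash
# Hypothetical helper: print the highest logged reading for each sensor.
# $1 = path to a log file in the format written by tempCheck.sh.
peak_temps() {
    grep -F '[INFO]:' "$1" |          # keep the regular readings, skip warnings
        sed 's/^.*\[INFO]: //' |      # drop the timestamp and level prefix
        tr '\t' '\n' |                # one "LABEL: VALUE" pair per line
        awk -F': ' '
            { value = $2 + 0          # awk reads the leading number of "45.0°C"
              if (!($1 in peak) || value > peak[$1]) peak[$1] = value }
            END { for (label in peak) printf "%s peak: %.1f\n", label, peak[label] }
        ' | sort
}

# Made-up sample data in the script's log format (not real measurements).
sample_log=$(mktemp)
printf '06/01/2024 02:00:00 PM [INFO]: CPU 0: 45.0°C\tCPU 1: 47.5°C\tHDD 0: 38.0°C\tHDD 1: 39.0°C\tVID 0: 55.0°C\n' > "$sample_log"
printf '06/01/2024 02:05:00 PM [INFO]: CPU 0: 52.5°C\tCPU 1: 46.0°C\tHDD 0: 40.0°C\tHDD 1: 39.5°C\tVID 0: 61.0°C\n' >> "$sample_log"
printf '06/01/2024 02:05:00 PM [WARNING]: Bad VID temp\tVID 0: 61.0°C\t\t\tVID Max: 60.0°C\n' >> "$sample_log"

peak_temps "$sample_log"
rm -f "$sample_log"
```

Comparing each peak against the matching maximum in the script gives a quick sense of how much headroom the system has and whether extra cooling is worth considering.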
Let's move on to one more example now.