monitors:hardware_sensors

Differences

This shows you the differences between two versions of the page.

monitors:hardware_sensors [2013/09/27 10:05] – [Hardware monitoring] doctor_madness
monitors:hardware_sensors [2022/12/11 11:05] – [Source] doktoil_makresh
Line 1: Line 1:
 ====== Hardware monitoring ====== ====== Hardware monitoring ======
  
-^ Author | [[ doctor@makelofine.org | Damien Martins ]] | +^ Author         | [[doctor@makelofine.org| Damien Martins ]]                    |
-^ Compatibility | Xymon 4.2.2/4.3.12 | +^ Compatibility  | Xymon 4.2.2/4.3.12                                            |
-^ Requirements | sh (or bash), hddtemp, smartmontools | +^ Requirements   | sh (or bash), hddtemp, smartmontools                          |
-^ Download | https://www.makelofine.org/xymon-plugins/hobbit-hardware-v0.5.tar.bz2 | +^ Download       | Part of https://github.com/doktoil-makresh/xymon-plugins.git  |
-^ Last Update | 2013-09-27 | +^ Last Update    | 2022-07-13                                                    |
  
 ===== Description ===== ===== Description =====
Line 11: Line 11:
 ===== Installation ===== ===== Installation =====
 === Client side === === Client side ===
-Untar this package, put hobbit-hardware.sh in $BBHOME/ext directory +Untar this package, put hobbit-hardware.sh in $XYMONCLIENTHOME/ext directory 
-Put hobbit-hardware.conf in $BBHOME/etc directory+Put xymon-hardware.cfg in $XYMONCLIENTHOME/etc directory
 Modify variables in both files to fit your needs/system Modify variables in both files to fit your needs/system
 +User 'xymon' should be allowed to use sudo on some commands (check variables including 'sudo' in xymon-hardware.sh)
 === Server side === === Server side ===
-Add hardware to your $BBHOME/server/bb-hosts line for the host running this script+Add hardware to your $XYMONHOME/server/hosts line for the host running this script
  
-===== Source ===== 
-=== hobbit-hardware.sh === 
-<hidden onHidden="Show Code ⇲" onVisible="Hide Code ⇱"> 
-<code bash> 
 #!/bin/bash #!/bin/bash
  
 # ALL THIS SCRIPT IS UNDER GPL LICENSE # ALL THIS SCRIPT IS UNDER GPL LICENSE
-# Version 0.4 +# Version 0.6 
-# Title:     hobbit-hardware+# Title:     xymon-hardware
 # Author:    Damien Martins  ( doctor |at| makelofine |dot| org) # Author:    Damien Martins  ( doctor |at| makelofine |dot| org)
-# Date:      2013-06-27+# Date:      2018-11-01
 # Purpose:   Check Uni* hardware sensors # Purpose:   Check Uni* hardware sensors
 # Platforms: Uni* having lm-sensor and hddtemp utilities # Platforms: Uni* having lm-sensor and hddtemp utilities
 # Tested:    Xymon 4.3.4 / hddtemp version 0.3-beta15 (Debian Lenny and Etch packages) / sensors version 3.0.2 with libsensors version 3.0.2 (Debian Lenny package) / sensors version 3.0.1 with libsensors version 3.0.1 (Debian Etch package) # Tested:    Xymon 4.3.4 / hddtemp version 0.3-beta15 (Debian Lenny and Etch packages) / sensors version 3.0.2 with libsensors version 3.0.2 (Debian Lenny package) / sensors version 3.0.1 with libsensors version 3.0.1 (Debian Etch package)
    
-#TODO for v0.5 +#TODO for v0.7 
-#       -To be independent of /etc/sensors.conf -> we get raw values, and we set right ones from those, and define thresholds in xymon-hardware.conf file+#       -To be independent of /etc/sensors.conf -> we get raw values, and we set right ones from those, and define thresholds in xymon-hardware.cfg file
 # -Support for multiple sensors # -Support for multiple sensors
 # -Support for independent temperature thresholds for each disk # -Support for independent temperature thresholds for each disk
 # #
 # History : # History :
 +# 01 nov 2018 - Steffan ??
 +# v0.5.1 : Adds support for spare drive (not reported as failed anymore)
 +# 27 sep 2013 - Damien Martins
 +# v0.5 : Adds support for HP monitoring tools (hpacucli)
 # 27 jun 2013 - Damien Martins and Xavier Carol i Rosell # 27 jun 2013 - Damien Martins and Xavier Carol i Rosell
-# v0.4 : Fix hddtemp output handling (print last field instead of field N) +# v0.4 : Fixes hddtemp output handling (print last field instead of field N) 
 # 09 sep 2011 - Damien Martins # 09 sep 2011 - Damien Martins
-# v0.3 : Add support for OpenManage Physical disks, temps+# v0.3 : Adds support for OpenManage Physical disks, temps
 # 17 feb 2010 - Damien Martins # 17 feb 2010 - Damien Martins
 # v0.2.2 : Minor code optimizations # v0.2.2 : Minor code optimizations
Line 49: Line 50:
 # v0.2 : -Getting sensor probe no more hard coded # v0.2 : -Getting sensor probe no more hard coded
 # -More verbosity when commands fail # -More verbosity when commands fail
-# -Disk temperature thresholds in xymon-hardware.conf file.+# -Disk temperature thresholds in xymon-hardware.cfg file.
 # -Support smartctl to replace hddtemp (if needed) # -Support smartctl to replace hddtemp (if needed)
 # -Possibility to disable lm-sensors # -Possibility to disable lm-sensors
Line 69: Line 70:
    
 #This script should be stored in the ext directory of the Xymon client home (typically ~xymon/client/ext). #This script should be stored in the ext directory of the Xymon client home (typically ~xymon/client/ext).
-#You must configure the xymon-hardware.conf file (or whatever name defined in CONFIG_FILE +#You must configure the xymon-hardware.cfg file (or whatever name defined in CONFIG_FILE)
- +
-#Change to fit your system/wills : +
-TEST="hardware" +
-MSG_FILE="${BBTMP}/xymon-hardware.msg" +
-CONFIG_FILE="${HOBBITCLIENTHOME}/etc/xymon-hardware.conf" +
-TMP_FILE="${BBTMP}/xymon-hardware.tmp" +
-CMD_HDDTEMP="sudo /usr/sbin/hddtemp" +
-SENSORS="/usr/bin/sensors" +
-BC="/usr/bin/bc" +
-SUDO="/usr/bin/sudo" +
-SMARTCTL="/usr/sbin/smartctl" +
-OMREPORT="/opt/dell/srvadmin/sbin/omreport"+
  
 #Debug #Debug
Line 87: Line 76:
  echo "Debug ON"  echo "Debug ON"
         BB=echo         BB=echo
-        HOBBITCLIENTHOME="/usr/local/xymon/client/" +        XYMONCLIENTHOME="/usr/local/Xymon/client/" 
-        BBTMP="$PWD"+        XYMONTMP="$PWD"
         BBDISP=your_xymon_server         BBDISP=your_xymon_server
         MACHINE=$(hostname)         MACHINE=$(hostname)
Line 98: Line 87:
  DATE="/bin/date"  DATE="/bin/date"
  SED="/bin/sed"  SED="/bin/sed"
- CONFIG_FILE="xymon-hardware.conf"+ CONFIG_FILE="xymon-hardware.cfg"
  TMP_FILE="xymon-hardware.tmp"  TMP_FILE="xymon-hardware.tmp"
  MSG_FILE="xymon-hardware.msg"  MSG_FILE="xymon-hardware.msg"
 fi fi
 +
 +#Change to fit your system/wills :
 +TEST="hardware"
 +MSG_FILE="${XYMONTMP}/xymon-hardware.msg"
 +CONFIG_FILE="${XYMONCLIENTHOME}/etc/xymon-hardware.cfg"
 +TMP_FILE="${XYMONTMP}/xymon-hardware.tmp"
 +CMD_HDDTEMP="sudo /usr/sbin/hddtemp"
 +SENSORS="/usr/bin/sensors"
 +BC="/usr/bin/bc"
 +SMARTCTL="sudo /usr/sbin/smartctl"
 +OMREPORT="/opt/dell/srvadmin/sbin/omreport"
 +HPACUCLI="sudo /usr/sbin/hpacucli"
  
 #Don't change anything from here (or assume all responsibility) #Don't change anything from here (or assume all responsibility)
Line 108: Line 109:
  
 #Basic tests : #Basic tests :
-if [ -z "$HOBBITCLIENTHOME" ] ; then +if [ -z "$XYMONCLIENTHOME" ] ; then 
-        echo "HOBBITCLIENTHOME not defined !"+        echo "XYMONCLIENTHOME not defined !"
         exit 1         exit 1
 fi fi
-if [ -z "$BBTMP" ] ; then +if [ -z "$XYMONTMP" ] ; then 
-        echo "BBTMP not defined !"+        echo "XYMONTMP not defined !"
         exit 1         exit 1
 fi fi
Line 139: Line 140:
 DISK_WARNING_TEMP=$($GREP ^DISK_WARNING_TEMP= $CONFIG_FILE | $SED s/^DISK_WARNING_TEMP=//) DISK_WARNING_TEMP=$($GREP ^DISK_WARNING_TEMP= $CONFIG_FILE | $SED s/^DISK_WARNING_TEMP=//)
 DISK_PANIC_TEMP=$($GREP ^DISK_PANIC_TEMP= $CONFIG_FILE | $SED s/^DISK_PANIC_TEMP=//) DISK_PANIC_TEMP=$($GREP ^DISK_PANIC_TEMP= $CONFIG_FILE | $SED s/^DISK_PANIC_TEMP=//)
 +
 +function set_disk_entries_values()
 +{
 +  ENTRIES=$1
 +  if [ "$(echo $ENTRIES | "$AWK" -F, '{print NF}')" -eq 1 ] ; then
 +     LOCAL_DISK_WARNING_TEMP=$DISK_WARNING_TEMP
 +     LOCAL_DISK_PANIC_TEMP=$DISK_PANIC_TEMP
 +  elif [ "$(echo $ENTRIES | "$AWK" -F, '{print NF}')" -eq 2 ] ; then
 +    LOCAL_DISK_WARNING_TEMP=$DISK_WARNING_TEMP
 +    LOCAL_DISK_PANIC_TEMP=$(echo $ENTRIES | "$AWK" -F, '{print $2}')
 +  elif [ "$(echo $ENTRIES | "$AWK" -F, '{print NF}')" -eq 3 ] ; then
 +    LOCAL_DISK_WARNING_TEMP=$(echo $ENTRIES | "$AWK" -F, '{print $2}')
 +    LOCAL_DISK_PANIC_TEMP=$(echo $ENTRIES | "$AWK" -F, '{print $3}')
 +  fi
 +}
  
 function use_hddtemp () function use_hddtemp ()
 { {
-for DISK in $("$GREP" "^DISK=" "$CONFIG_FILE" | "$SED" s/^DISK=//) ; do+  for ENTRIES in $("$GREP" "^DISK=" "$CONFIG_FILE" | "$SED" s/^DISK=// ) ; do 
 +  DISK=$(echo $ENTRIES | "$AWK" -F, '{print $1}')
 + set_disk_entries_values $ENTRIES
  HDD_TEMP="$($CMD_HDDTEMP $DISK | $SED s/..$// | $AWK '{print $NF}')"  HDD_TEMP="$($CMD_HDDTEMP $DISK | $SED s/..$// | $AWK '{print $NF}')"
  if [ ! "$(echo $HDD_TEMP | grep "^[ [:digit:] ]*$")" ] ; then  if [ ! "$(echo $HDD_TEMP | grep "^[ [:digit:] ]*$")" ] ; then
Line 148: Line 166:
  LINE="&red Disk $DISK temperature is UNKNOWN (HDD_TEMP VALUE IS : $HDD_TEMP).  LINE="&red Disk $DISK temperature is UNKNOWN (HDD_TEMP VALUE IS : $HDD_TEMP).
 It seems S.M.A.R.T. is no more responding !!!" It seems S.M.A.R.T. is no more responding !!!"
- echo "La temp�rature de $DISK n'est pas un nombre :/+ echo "La température de $DISK n'est pas un nombre :/
 HDD_TEMP : $HDD_TEMP" HDD_TEMP : $HDD_TEMP"
- elif [ "$HDD_TEMP" -ge "$DISK_PANIC_TEMP" ] ; then+ elif [ "$HDD_TEMP" -ge "$LOCAL_DISK_PANIC_TEMP" ] ; then
  RED=1  RED=1
- LINE="&red Disk temperature is CRITICAL (Panic is $DISK_PANIC_TEMP) :+ LINE="&red Disk temperature is CRITICAL (Panic is $LOCAL_DISK_PANIC_TEMP) :
 "$DISK"_temperature: ${HDD_TEMP}" "$DISK"_temperature: ${HDD_TEMP}"
- elif [ "$HDD_TEMP" -ge "$DISK_WARNING_TEMP" ] ; then+ elif [ "$HDD_TEMP" -ge "$LOCAL_DISK_WARNING_TEMP" ] ; then
  YELLOW="1"  YELLOW="1"
- LINE="&yellow Disk temperature is HIGH (Warning is $DISK_WARNING_TEMP) :+ LINE="&yellow Disk temperature is HIGH (Warning is $LOCAL_DISK_WARNING_TEMP) :
 "$DISK"_temperature: ${HDD_TEMP}" "$DISK"_temperature: ${HDD_TEMP}"
- elif [ "$HDD_TEMP" -lt "$DISK_WARNING_TEMP" ] ; then + elif [ "$HDD_TEMP" -lt "$LOCAL_DISK_WARNING_TEMP" ] ; then 
- LINE="&green Disk temperature is OK (Warning is $DISK_WARNING_TEMP) :+ LINE="&green Disk temperature is OK (Warning is $LOCAL_DISK_WARNING_TEMP) :
 "$DISK"_temperature: ${HDD_TEMP}" "$DISK"_temperature: ${HDD_TEMP}"
  fi  fi
Line 174: Line 192:
  SMARTCTL_ARGS="-A"  SMARTCTL_ARGS="-A"
 fi fi
-for DISK in $("$GREP" "^DISK=" "$CONFIG_FILE" | "$SED" s/^DISK=//) ; do +for ENTRIES in $("$GREP" "^DISK=" "$CONFIG_FILE" | "$SED" s/^DISK=//) ; do 
- HDD_TEMP="$($SUDO $SMARTCTL $SMARTCTL_ARGS $DISK | $GREP "^194" | $AWK '{print $10}')"+ DISK=$(echo $ENTRIES | "$AWK" -F, '{print $1}')
 + set_disk_entries_values $ENTRIES 
 + HDD_TEMP="$($SMARTCTL $SMARTCTL_ARGS $DISK | $GREP "^194" | $AWK '{print $10}')"
         if [ ! "$(echo $HDD_TEMP | grep "^[ [:digit:] ]*$")" ] ; then         if [ ! "$(echo $HDD_TEMP | grep "^[ [:digit:] ]*$")" ] ; then
                 RED=1                 RED=1
                 LINE="&red Disk $DISK temperature is UNKNOWN (HDD_TEMP VALUE IS : $HDD_TEMP).                 LINE="&red Disk $DISK temperature is UNKNOWN (HDD_TEMP VALUE IS : $HDD_TEMP).
 It seems S.M.A.R.T. is no more responding !!!" It seems S.M.A.R.T. is no more responding !!!"
-        echo "La temp�rature de $DISK n'est pas un nombre :/+        echo "La température de $DISK n'est pas un nombre :/
 HDD_TEMP : $HDD_TEMP" HDD_TEMP : $HDD_TEMP"
-        elif [ "$HDD_TEMP" -ge "$DISK_PANIC_TEMP" ] ; then+        elif [ "$HDD_TEMP" -ge "$LOCAL_DISK_PANIC_TEMP" ] ; then
                 RED=1                 RED=1
-                LINE="&red Disk temperature is CRITICAL (Panic is $DISK_PANIC_TEMP) :+                LINE="&red Disk temperature is CRITICAL (Panic is $LOCAL_DISK_PANIC_TEMP) :
 "$DISK"_temperature: ${HDD_TEMP}" "$DISK"_temperature: ${HDD_TEMP}"
-        elif [ "$HDD_TEMP" -ge "$DISK_WARNING_TEMP" ] ; then+        elif [ "$HDD_TEMP" -ge "$LOCAL_DISK_WARNING_TEMP" ] ; then
                 YELLOW="1"                 YELLOW="1"
-                LINE="&yellow Disk temperature is HIGH (Warning is $DISK_WARNING_TEMP) :+                LINE="&yellow Disk temperature is HIGH (Warning is $LOCAL_DISK_WARNING_TEMP) :
 "$DISK"_temperature: ${HDD_TEMP}" "$DISK"_temperature: ${HDD_TEMP}"
-        elif [ "$HDD_TEMP" -lt "$DISK_WARNING_TEMP" ] ; then +        elif [ "$HDD_TEMP" -lt "$LOCAL_DISK_WARNING_TEMP" ] ; then 
-                LINE="&green Disk temperature is OK (Warning is $DISK_WARNING_TEMP) :+                LINE="&green Disk temperature is OK (Warning is $LOCAL_DISK_WARNING_TEMP) :
 "$DISK"_temperature: ${HDD_TEMP}" "$DISK"_temperature: ${HDD_TEMP}"
         fi         fi
Line 225: Line 245:
 unset MIN MAX PANIC VALUE WARNING unset MIN MAX PANIC VALUE WARNING
 } }
 +
 function test_fan () function test_fan ()
 { {
Line 344: Line 365:
 function use_openmanage () function use_openmanage ()
 { {
-rm -f ${BBTMP}/xymon-hardware_volts.tmp ${BBTMP}/xymon-hardware_fans.tmp ${BBTMP}/xymon-hardware_disks.tmp+rm -f ${XYMONTMP}/xymon-hardware_volts.tmp ${XYMONTMP}/xymon-hardware_fans.tmp ${XYMONTMP}/xymon-hardware_disks.tmp
 #Tests temperatures : #Tests temperatures :
  CHASSIS_TEMP=$($OMREPORT chassis temps | grep Reading |awk '{print $3}' | $AWK -F\. '{print $1}')  CHASSIS_TEMP=$($OMREPORT chassis temps | grep Reading |awk '{print $3}' | $AWK -F\. '{print $1}')
Line 356: Line 377:
  CHASSIS_TEMP_STATUS=red  CHASSIS_TEMP_STATUS=red
  echo "&red La temperature du chassis est en ALERTE !!! :  echo "&red La temperature du chassis est en ALERTE !!! :
-temperature_chassis: $CHASSIS_TEMP" >> ${BBTMP}/xymon-hardware.msg+temperature_chassis: $CHASSIS_TEMP" >> $MSG_FILE
  RED=1  RED=1
  elif [ $CHASSIS_TEMP -ge $CHASSIS_TEMP_WARNING ] ; then  elif [ $CHASSIS_TEMP -ge $CHASSIS_TEMP_WARNING ] ; then
Line 362: Line 383:
  YELLOW=1  YELLOW=1
  echo "&yellow La temperature du chassis est en LIMITE-LIMITE !!! :  echo "&yellow La temperature du chassis est en LIMITE-LIMITE !!! :
-temperature_chassis: $CHASSIS_TEMP" >> ${BBTMP}/xymon-hardware.msg+temperature_chassis: $CHASSIS_TEMP" >> $MSG_FILE
  elif [ $CHASSIS_TEMP -lt $CHASSIS_TEMP_WARNING ] ; then  elif [ $CHASSIS_TEMP -lt $CHASSIS_TEMP_WARNING ] ; then
  CHASSIS_TEMP_STATUS=green  CHASSIS_TEMP_STATUS=green
- echo "&green La temperature du chassis est Ok !" >> ${BBTMP}/xymon-hardware.msg+ echo "&green La temperature du chassis est Ok !" >> $MSG_FILE
  else  else
  echo "Erreur dans les valeurs de temperatures :  echo "Erreur dans les valeurs de temperatures :
Line 379: Line 400:
  VOLT_GLOBAL_STATUS=green  VOLT_GLOBAL_STATUS=green
  else  else
- $OMREPORT chassis volts | grep -A 2 Index  |grep -v Index | grep -v "\-\-" | cut -c 29- > ${BBTMP}/xymon-hardware_volts.tmp+ $OMREPORT chassis volts | grep -A 2 Index  |grep -v Index | grep -v "\-\-" | cut -c 29- > ${XYMONTMP}/xymon-hardware_volts.tmp
  while read LINE ; do  while read LINE ; do
  echo $LINE | grep -q Status | grep -q Ok  echo $LINE | grep -q Status | grep -q Ok
  if [ $ERROR ] ; then  if [ $ERROR ] ; then
  PROBE_IN_ERROR="$LINE"  PROBE_IN_ERROR="$LINE"
- echo "&yellow Le voltage de $PROBE_IN_ERROR est incorrect !" >> ${BBTMP}/xymon-hardware.msg+ echo "&yellow Le voltage de $PROBE_IN_ERROR est incorrect !" >> $MSG_FILE
  fi  fi
  unset ERROR  unset ERROR
Line 391: Line 412:
  ERROR=1  ERROR=1
  fi  fi
- done < ${BBTMP}/xymon-hardware_volts.tmp+ done < ${XYMONTMP}/xymon-hardware_volts.tmp
  fi  fi
 if [ $VOLT_YELLOW ] ; then if [ $VOLT_YELLOW ] ; then
Line 402: Line 423:
  FANS_GLOBAL_STATUS=green  FANS_GLOBAL_STATUS=green
  else  else
- $OMREPORT chassis fans | grep -A 6 Index  |grep -v Index | grep -v "\-\-" |grep -v "N\/A" | cut -c 29- > ${BBTMP}/xymon-hardware_fans.tmp+ $OMREPORT chassis fans | grep -A 6 Index  |grep -v Index | grep -v "\-\-" |grep -v "N\/A" | cut -c 29- > ${XYMONTMP}/xymon-hardware_fans.tmp
                 while read LINE ; do                 while read LINE ; do
  if [ $NEXT_LINE == FAN_MIN_RPM ] ; then  if [ $NEXT_LINE == FAN_MIN_RPM ] ; then
  FAN_MIN_RPM=$(echo $LINE | awk '{print $1}')  FAN_MIN_RPM=$(echo $LINE | awk '{print $1}')
  echo "&yellow Le ventilateur $FAN_NAME tourne trop lentement ($FAN_RPM inferieur a ${FAN_MIN_RPM}) !!!  echo "&yellow Le ventilateur $FAN_NAME tourne trop lentement ($FAN_RPM inferieur a ${FAN_MIN_RPM}) !!!
-${FAN_NAME}_rpm: $FAN_RPM" >> ${BBTMP}/xymon-hardware_fans.msg+${FAN_NAME}_rpm: $FAN_RPM" >> ${XYMONTMP}/xymon-hardware_fans.msg
  unset NEXT_LINE  unset NEXT_LINE
  fi  fi
Line 419: Line 440:
  if [ $FAN_RPM -le 0 ] ; then  if [ $FAN_RPM -le 0 ] ; then
  FAN_RED=1  FAN_RED=1
- echo "&red Le ventilateur $FAN_NAME ne tourne plus !!!" >> ${BBTMP}/xymon-hardware_fans.msg+ echo "&red Le ventilateur $FAN_NAME ne tourne plus !!!" >> ${XYMONTMP}/xymon-hardware_fans.msg
  fi  fi
                         unset ERROR                         unset ERROR
Line 429: Line 450:
  NEXT_LINE=FAN_NAME  NEXT_LINE=FAN_NAME
                         fi                         fi
-                        done < ${BBTMP}/xymon-hardware_fans.tmp+                        done < ${XYMONTMP}/xymon-hardware_fans.tmp
         fi         fi
 if [ $FAN_RED ] ; then if [ $FAN_RED ] ; then
  RED=1  RED=1
  echo "&red Probleme avec les vitesses des ventilateurs !  echo "&red Probleme avec les vitesses des ventilateurs !
-$(cat ${BBTMP}/xymon-hardware_fans.msg)" >> ${BBTMP}/xymon-hardware.msg+$(cat ${XYMONTMP}/xymon-hardware_fans.msg)" >> $MSG_FILE
 elif [ $FAN_YELLOW ] ; then elif [ $FAN_YELLOW ] ; then
         YELLOW=1         YELLOW=1
  echo "&yellow Probleme avec les vitesses des ventilateurs !  echo "&yellow Probleme avec les vitesses des ventilateurs !
-$(cat ${BBTMP}/xymon-hardware_fans.msg)" >> ${BBTMP}/xymon-hardware.msg+$(cat ${XYMONTMP}/xymon-hardware_fans.msg)" >> $MSG_FILE
 else else
  VOLT_GLOBAL_STATUS=green  VOLT_GLOBAL_STATUS=green
- echo "&green Tout va bien avec les ventilateurs" >> ${BBTMP}/xymon-hardware.msg+ echo "&green Tout va bien avec les ventilateurs" >> $MSG_FILE
 fi fi
  
Line 447: Line 468:
 $OMREPORT storage pdisk controller=0 |grep ^Status | grep -q Ok $OMREPORT storage pdisk controller=0 |grep ^Status | grep -q Ok
 if [ $? -eq 0 ] ; then if [ $? -eq 0 ] ; then
- echo "&green Le statut des disques est Ok !" >> ${BBTMP}/xymon-hardware.msg+ echo "&green Le statut des disques est Ok !" >> $MSG_FILE
 else else
  DISK_COLOR=yellow  DISK_COLOR=yellow
- $OMREPORT storage pdisk controller=0 |grep -A 1 ^Status | grep -v "\-\-" > ${BBTMP}/xymon-hardware_disks.tmp+ $OMREPORT storage pdisk controller=0 |grep -A 1 ^Status | grep -v "\-\-" > ${XYMONTMP}/xymon-hardware_disks.tmp
  while read LINE ; do  while read LINE ; do
  echo $LINE | grep -q Status | grep -q Ok  echo $LINE | grep -q Status | grep -q Ok
  if [ $NEXT_LINE == DISK_NAME ] ; then  if [ $NEXT_LINE == DISK_NAME ] ; then
  DISK_NAME=$(echo $LINE | cut -c 29-)  DISK_NAME=$(echo $LINE | cut -c 29-)
- echo "&yellow Le disque $DISK_NAME est en mauvaise situation !" >> ${BBTMP}/xymon-hardware.msg+ echo "&yellow Le disque $DISK_NAME est en mauvaise situation !" >> $MSG_FILE
  unset NEXT_LINE  unset NEXT_LINE
  fi  fi
Line 464: Line 485:
  NEXT_LINE=DISK_NAME  NEXT_LINE=DISK_NAME
  fi  fi
- done < ${BBTMP}/xymon-hardware_disks.tmp+ done < ${XYMONTMP}/xymon-hardware_disks.tmp
   
 fi fi
 } }
 +function use_hpacucli ()
 +{
 +# Feed the loop through process substitution rather than a pipe, so that RED
 +# and YELLOW are set in the current shell and are still visible after the loop.
 +while read OUTPUT ; do
 +        TYPE=$(echo $OUTPUT | awk '{print $1}' | sed s/drive//)
 +        SLOT=$(echo $OUTPUT | awk '{print $2}')
 +        STATUS=$(echo $OUTPUT | awk '{print $NF}' | sed s/\)//)
 +        if [ "$STATUS" == "spare" ] ; then
 +                STATUS=$(echo $OUTPUT | cut -d',' -f4 | sed 's/ //g')
 +        fi
 +        if [ "$TYPE" == "logical" ] ; then
 +                RAID=$(echo $OUTPUT | awk '{print $6}')
 +                SIZE=$(echo $OUTPUT | awk '{print $3 $4}' | sed s/\(// | sed s/\,//)
 +                if [ "$STATUS" != "OK" ] ; then
 +                        RED=1
 +                        LINE="&red Logical drive $SLOT (RAID $RAID, size : $SIZE) status is BAD !!!"
 +                elif [ "$STATUS" == "OK" ] ; then
 +                        LINE="&green Logical drive $SLOT (RAID $RAID, size : $SIZE) status is OK"
 +                else
 +                        RED=1
 +                        LINE="&red Unknown status (or stupid monitoring script) for logical drive $SLOT (RAID $RAID, size : $SIZE) !!!"
 +                fi
 +        elif [ "$TYPE" == "physical" ] ; then
 +                SIZE=$(echo $OUTPUT | awk '{print $8 $9}' | sed s/\,//)
 +                if [ "$STATUS" != "OK" ] ; then
 +                        YELLOW=1
 +                        LINE="&yellow Physical drive in slot $SLOT (size : $SIZE) status is BAD !!!"
 +                elif [ "$STATUS" == "OK" ] ; then
 +                        LINE="&green Physical drive in slot $SLOT (size : $SIZE) status is OK"
 +                else
 +                        RED=1
 +                        LINE="&red Unknown status (or stupid monitoring script) for physical drive in slot $SLOT (size : $SIZE) !!!"
 +                fi
 +        fi
 +        echo $LINE >> $MSG_FILE
 +done < <($HPACUCLI ctrl all show config | grep drive)
 +}
 +
 +$GREP -q ^HPACUCLI=1 $CONFIG_FILE
 +if [ $? -eq 0 ] ; then
 +        use_hpacucli
 +fi
 $GREP -q ^SMARTCTL=1 $CONFIG_FILE $GREP -q ^SMARTCTL=1 $CONFIG_FILE
 if [ $? -eq 0 ] ; then if [ $? -eq 0 ] ; then
Line 480: Line 542:
  use_openmanage  use_openmanage
 fi fi
- 
 $GREP -q ^SENSOR=1 $CONFIG_FILE $GREP -q ^SENSOR=1 $CONFIG_FILE
 if [ $? -eq 0 ] ; then if [ $? -eq 0 ] ; then
Line 493: Line 554:
 fi fi
 "$BB" "$BBDISP" "status "$MACHINE"."$TEST" "$FINAL_STATUS" $("$DATE") "$BB" "$BBDISP" "status "$MACHINE"."$TEST" "$FINAL_STATUS" $("$DATE")
- 
 $("$CAT" "$MSG_FILE") $("$CAT" "$MSG_FILE")
 " "
-</code> 
-</hidden> 
- 
 ===== Known  Bugs and Issues ===== ===== Known  Bugs and Issues =====
 None None
  
 ===== To Do ===== ===== To Do =====
-v0.5+v0.6
   * To be independent of /etc/sensors.conf -> we get raw values, and we set right ones from those, and define thresholds in hobbit-hardware.conf file            * To be independent of /etc/sensors.conf -> we get raw values, and we set right ones from those, and define thresholds in hobbit-hardware.conf file
   * Support for independent temperature thresholds for each disk   * Support for independent temperature thresholds for each disk
Line 532: Line 589:
   * **2013-06-27 v0.4**   * **2013-06-27 v0.4**
     * Fix hddtemp output handling (print last field instead of field N)     * Fix hddtemp output handling (print last field instead of field N)
 +  * **2013-09-27 v0.5**
 +    * Add support for HP monitoring tool (hpacucli)
 +  * **2022-07-13 v0.6**
 +    * Add support for independent per-disk temperature thresholds
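The per-disk thresholds introduced in v0.6 come from the DISK= lines of xymon-hardware.cfg. As the script reads them, an entry with one field uses the global DISK_WARNING_TEMP and DISK_PANIC_TEMP, a second field overrides only the panic threshold, and three fields mean device, warning, panic. The following is a minimal standalone sketch of that rule, not the script's own code: the helper name parse_disk_entry, the devices, and the threshold values are all assumptions made for illustration (the script itself uses set_disk_entries_values and awk).

```shell
#!/bin/sh
# Global defaults; these values are made up for illustration.
DISK_WARNING_TEMP=45
DISK_PANIC_TEMP=55

# Hypothetical helper: prints "device warning panic" for one DISK= entry.
parse_disk_entry() {
  old_ifs=$IFS
  IFS=,
  set -f              # disable globbing while the entry is split
  set -- $1           # split the entry on commas into $1, $2, $3
  set +f
  IFS=$old_ifs
  case $# in
    1) echo "$1 $DISK_WARNING_TEMP $DISK_PANIC_TEMP" ;;  # device only
    2) echo "$1 $DISK_WARNING_TEMP $2" ;;                # device,panic
    3) echo "$1 $2 $3" ;;                                # device,warning,panic
    *) echo "unsupported DISK entry" >&2 ; return 1 ;;
  esac
}

# Sample entries as they might appear after "DISK=" in the config file.
for entry in /dev/sda /dev/sdb,60 /dev/sdc,40,50 ; do
  parse_disk_entry "$entry"
done
```

Under these assumptions, a line such as DISK=/dev/sdc,40,50 gives /dev/sdc its own warning and panic thresholds, while DISK=/dev/sda keeps the global ones.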
 +
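The v0.4 fix reads the last whitespace-separated field of the hddtemp output instead of a fixed field number. One likely reason is that drive model names can contain spaces, which shifts the position of the temperature. A small sketch of the idea, run against made-up sample lines in a "device: model: temperature" layout: the sample strings and the helper name last_field_temp are assumptions, and the unit is stripped here by dropping everything from the first non-digit, whereas the script removes the last two characters with sed.

```shell
#!/bin/sh
# Hypothetical helper: extract the numeric temperature from one output line.
last_field_temp() {
  # Take the last field, then drop everything from the first non-digit on.
  echo "$1" | awk '{print $NF}' | sed 's/[^0-9].*$//'
}

# Sample lines; the second model name contains a space.
last_field_temp "/dev/sda: ST3500418AS: 35°C"
last_field_temp "/dev/sdb: WDC WD5000AAKX-001CA0: 41°C"
```

Both sample lines yield a plain integer even though the second one has one more field than the first, which is what the numeric threshold comparisons need.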
 +
  • monitors/hardware_sensors.txt
  • Last modified: 2022/12/11 11:12
  • by doktoil_makresh