monitors:xymon-smart

monitors:xymon-smart [2012/08/30 05:14] (current)
 +====== xymon-SMART.sh ======
  
 +^ Author | [[ jlaidman+xymon-smart@rebel-it.com.au | Jeremy Laidman ]] |
 +^ Compatibility | Xymon 4.3.3 |
 +^ Requirements | smartmontools, GNU ls, GNU date |
 +^ Download | None |
 +^ Last Update | 2012-08-30 |
 +
 +===== Description =====
 +This script queries the SMART parameters of the drives on a system, and returns the status of those drives as well as reporting various metrics available from the SMART data.
 +
 +The script gets its configuration from the environment or from a configuration file.
 +
 +The script runs in write mode (with a "-w" switch) to create the status file from the output of the smartctl command. Typically this is done every 5 minutes from cron.
 +
 +The script also runs in read mode (with a "-r" switch) to read in the status file and parse it for sending data and status reports to Xymon.  Typically this is done every 5 minutes from a xymonlaunch configuration file (tasks.cfg on a Xymon server, or xymonlaunch.cfg on a Xymon client).
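Because the two modes run on independent schedules, read mode has to decide whether the status file is recent enough to trust. The script's own check rejects a file older than 600 seconds or one with a timestamp in the future. The following is a minimal sketch of that decision only; ''is_fresh'' is a hypothetical helper that takes the timestamps as plain numbers, whereas the real script obtains them from GNU ''ls'' and ''date''.

```shell
#!/bin/sh
# Sketch only: is_fresh is a hypothetical helper, not part of xymon-SMART.sh.
# Arguments: file mtime (epoch seconds), current time (epoch seconds),
# maximum allowed age in seconds.
is_fresh() {
    mtime=$1; now=$2; max_age=$3
    age=$((now - mtime))
    # A negative age means the timestamp lies in the future: treat as invalid.
    [ "$age" -ge 0 ] && [ "$age" -le "$max_age" ]
}

# Example with fixed numbers: the first file is 300 seconds old,
# the second is 900 seconds old.
if is_fresh 1000 1300 600; then echo "fresh"; else echo "stale"; fi
if is_fresh 1000 1900 600; then echo "fresh"; else echo "stale"; fi
```

With a 5-minute cron interval, a 600-second limit tolerates one missed write before the report is treated as stale.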
 +
 +In read mode, the script constructs a status report for Xymon that warns if any of the following problems is detected:
 +  * SMART is not enabled on the drive
 +  * SMART self-test is not "OK"
 +  * SMART health status is not "OK"
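The way these three checks combine into a single status colour can be sketched as follows. This mirrors the logic in the source further down (yellow when SMART is not enabled, red when the health status or self-test is not "OK"), but ''smart_color'' and its argument order are assumptions made for illustration, not names from the script.

```shell
#!/bin/sh
# Sketch only: smart_color is a hypothetical helper that mirrors the checks
# listed above. Arguments: enabled flag (1 or empty), health status string,
# self-test status string.
smart_color() {
    enabled=$1; health=$2; selftest=$3
    color="green"
    [ -n "$enabled" ] || color="yellow"     # SMART not enabled: warning
    [ "$health" = "OK" ] || color="red"     # health not OK: alarm
    [ "$selftest" = "OK" ] || color="red"   # self-test not OK: alarm
    echo "$color"
}

smart_color 1 OK OK        # all checks pass
smart_color "" OK OK       # SMART not enabled
smart_color 1 FAILED OK    # health status not OK
```

The later checks override the earlier one, so a drive that fails the health check reports red even if SMART is also disabled.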
 + 
 +The script also sends a data report for Xymon to turn into RRD files for graphing.  The data points reported are:
 +  * corrected read errors
 +  * corrected write errors
 +  * uncorrected read errors
 +  * uncorrected write errors
 +  * non-medium errors
 +  * disk temperature
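For orientation, the data report is a plain-text block per device, with one DS line for each data point. The sketch below prints such a block in the layout used by the script's ''get_rrd_data'' function; ''rrd_block'' is a hypothetical helper and the numbers are made-up sample values, not real drive output.

```shell
#!/bin/sh
# Sketch only: rrd_block is a hypothetical helper. It prints one section of a
# Xymon "data" message in the same layout that get_rrd_data emits.
# Arguments: device address, corrected and uncorrected read errors,
# corrected and uncorrected write errors, non-medium errors, temperature.
rrd_block() {
    echo "[smart.$1.rrd]"
    echo "DS:err_r_c:COUNTER:600:0:U $2"
    echo "DS:err_r_u:COUNTER:600:0:U $3"
    echo "DS:err_w_c:COUNTER:600:0:U $4"
    echo "DS:err_w_u:COUNTER:600:0:U $5"
    echo "DS:err_nmec:COUNTER:600:0:U $6"
    echo "DS:temp:GAUGE:600:U:U $7"
}

# Made-up sample values for a device addressed as "cciss,0".
rrd_block "cciss,0" 12 0 3 0 7 31
```

The error counts are declared as COUNTER data sources, which is why the graphs plot them as rates, while the temperature is a GAUGE and is plotted as reported.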
 +
 +{{:monitors:xymon-smart.sh-1.png?200|}}
 +
 +{{:monitors:xymon-smart.sh-2.png?200|}}
 +
 +===== Installation =====
 +=== Client side ===
 +1) Copy the script into a suitable location, such as ''/usr/lib/xymon/client/ext/xymon-SMART.sh''
 +
 +2) Create a crontab entry (eg /etc/cron.d/xymon-SMART.cron) containing this:
 +
 +<code>
 +*/5 * * * * root ( umask 002; XYMONCLIENTHOME=/usr/lib/xymon/client \
 +    CONTROLLER=cciss COUNT=0 DEVICE=cciss/c0d0 \
 +    /path/to/xymon-SMART.sh -w /tmp/SMART.status ) 2>/tmp/SMART.status.err
 +</code>
 +
 +Adjust for your requirements.  Use "cat /proc/partitions" to
 +find a suitable DEVICE value.  Test out values with:
 +
 +  smartctl -d $CONTROLLER,$COUNT -i /dev/$DEVICE
 +
 +For multiple devices, specify a comma-separated list of numbers
 +in the COUNT variable, such as:
 +  ... COUNT=0,1 ...
 +Note: smartctl itself does not accept a comma-separated list; the script splits COUNT and runs smartctl once for each value.
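To preview which smartctl invocations a given COUNT would produce, the splitting can be sketched like this. The helper ''list_commands'' is hypothetical and only prints the commands (using the ''-i'' test form shown above) rather than running them; the variable values are the examples from the crontab entry.

```shell
#!/bin/sh
# Sketch only: print the smartctl command that would be run for each entry
# in a comma-separated COUNT. Nothing is executed against real hardware.
CONTROLLER=cciss
COUNT=0,1
DEVICE=cciss/c0d0

list_commands() {
    old_ifs=$IFS
    IFS=,
    set -- $COUNT        # split on commas into positional parameters
    IFS=$old_ifs
    for c in "$@"; do
        echo "smartctl -d $CONTROLLER,$c -i /dev/$DEVICE"
    done
}

list_commands
```

Running each printed command by hand is a quick way to confirm that every controller/count pair addresses a real drive before enabling the cron job.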
 +
 +3) Create a Xymon client tasks entry like this:
 +
 +  [smart]
 +       ENVFILE $XYMONCLIENTHOME/etc/xymonclient.cfg
 +       CMD /path/to/xymon-SMART.sh -r /tmp/SMART.status
 +       LOGFILE $XYMONCLIENTLOGS/xymonclient.log
 +       INTERVAL 5m
 +
 +=== Server side ===
 +4) Create entries in graphs.cfg like so:
 +
 +    [smart]
 +        # total read/write errors
 +        TITLE S.M.A.R.T. Total Media Errors
 +        YAXIS errors per second
 +        FNPATTERN ^smart.(.*).rrd
 +        DEF:rc@RRDIDX@=@RRDFN@:err_r_c:AVERAGE
 +        DEF:ru@RRDIDX@=@RRDFN@:err_r_u:AVERAGE
 +        DEF:wc@RRDIDX@=@RRDFN@:err_w_c:AVERAGE
 +        DEF:wu@RRDIDX@=@RRDFN@:err_w_u:AVERAGE
 +        CDEF:re@RRDIDX@=rc@RRDIDX@,ru@RRDIDX@,+
 +        CDEF:we@RRDIDX@=wc@RRDIDX@,wu@RRDIDX@,+
 +        COMMENT:@RRDPARAM@\:\n
 +        LINE1:re@RRDIDX@#@COLOR@:Read Errors         :
 +        GPRINT:re@RRDIDX@:LAST:\: %5.1lf %s (cur)
 +        GPRINT:re@RRDIDX@:MAX: %5.1lf %s (max)
 +        GPRINT:re@RRDIDX@:MIN: %5.1lf %s (min)
 +        GPRINT:re@RRDIDX@:AVERAGE: %5.1lf %s (avg)\n
 +        LINE1:we@RRDIDX@#@COLOR@:Write Errors        :
 +        GPRINT:we@RRDIDX@:LAST:\: %5.1lf %s (cur)
 +        GPRINT:we@RRDIDX@:MAX: %5.1lf %s (max)
 +        GPRINT:we@RRDIDX@:MIN: %5.1lf %s (min)
 +        GPRINT:we@RRDIDX@:AVERAGE: %5.1lf %s (avg)\n
 +    
 +    [smart_temp]
 +        TITLE S.M.A.R.T. Disk Temperature
 +        YAXIS Celsius
 +        FNPATTERN ^smart.(.*).rrd
 +        DEF:temp@RRDIDX@=@RRDFN@:temp:AVERAGE
 +        LINE1:temp@RRDIDX@#@COLOR@:@RRDPARAM@ temperature:
 +        GPRINT:temp@RRDIDX@:LAST:\: %5.1lf°C (cur)
 +        GPRINT:temp@RRDIDX@:MAX: %5.1lf°C (max)
 +        GPRINT:temp@RRDIDX@:MIN: %5.1lf°C (min)
 +        GPRINT:temp@RRDIDX@:AVERAGE: %5.1lf°C (avg)\n
 +    
 +    [smart_nonmedium]
 +        TITLE S.M.A.R.T. Non-Medium Errors
 +        YAXIS errors per second
 +        FNPATTERN ^smart.(.*).rrd
 +        DEF:nmec@RRDIDX@=@RRDFN@:err_nmec:AVERAGE
 +        LINE1:nmec@RRDIDX@#@COLOR@:@RRDPARAM@ non-medium errors:
 +        GPRINT:nmec@RRDIDX@:LAST:\: %5.1lf %s (cur)
 +        GPRINT:nmec@RRDIDX@:MAX: %5.1lf %s (max)
 +        GPRINT:nmec@RRDIDX@:MIN: %5.1lf %s (min)
 +        GPRINT:nmec@RRDIDX@:AVERAGE: %5.1lf %s (avg)\n
 +
 +Add further graph definitions as desired.  The RRD files contain the following DS names:
 +  * err_r_c  = corrected read errors
 +  * err_w_c  = corrected write errors
 +  * err_r_u  = uncorrected read errors
 +  * err_w_u  = uncorrected write errors
 +  * err_nmec = non-medium errors
 +  * temp     = disk temperature
 +
 +5) Add "smart" to the TEST2RRD and GRAPHS variables in xymonserver.cfg, to have the graphs included on the smart status page and the trends page.
 +
 +6) Add "TRENDS:*,smart:smart|smart_temp" to the relevant entries in hosts.cfg, or the "_default_" entry.
 +===== Source =====
 +==== xymon-SMART.sh ====
 +
 +<hidden onHidden="Show Code ⇲" onVisible="Hide Code ⇱">
 +<code>
 +#!/bin/sh
 +
 +# SMART disk monitor
 +# Jeremy Laidman, 2012
 +#
 +# Version 0.5 - August 2012
 +#    - initial public release
 +#
 +# Initially based on Michael Adelmann's "smart" script
 +# (see: http://xymonton.org/monitors:smart), the main
 +# improvements are to support multiple disks, and to
 +# send error counts for graphing.
 +#
 +# This script queries the SMART parameters of the drives
 +# on a system, and returns the status of those drives
 +# as well as reporting various metrics available from
 +# the SMART data.
 +#
 +# How it Works
 +# ------------
 +#
 +# The script gets its configuration from the environment
 +# or from a configuration file.
 +#
 +# The script runs in write mode (with a "-w" switch) to
 +# create the status file from the output of the smartctl
 +# command.  Typically this is done every 5 minutes from cron.
 +#
 +# The script also runs in read mode (with a "-r" switch)
 +# to read in the status file and parse it for sending data
 +# and status reports to Xymon.  Typically this is done
 +# every 5 minutes from a xymonlaunch configuration file
 +# (tasks.cfg on a Xymon server, or xymonlaunch.cfg on
 +# a Xymon client).
 +#
 +# In read mode, the script constructs a status report
 +# for Xymon that warns if any of the following problems
 +# is detected:
 +#     - SMART is not enabled on the drive
 +#     - SMART self-test is not "OK"
 +#     - SMART health status is not "OK"
 +#
 +# The script also sends a data report for Xymon to turn
 +# into RRD files for graphing.  The data points reported
 +# are:
 +#    - corrected read errors
 +#    - corrected write errors
 +#    - uncorrected read errors
 +#    - uncorrected write errors
 +#    - non-medium errors
 +#    - disk temperature
 +#
 +#
 +# To Install
 +# ----------
 +#
 +# Client-side:
 +#
 +# 1) Copy the script into a suitable location,
 +#    such as /usr/lib/xymon/client/ext/xymon-SMART.sh
 +#
 +# 2) Create a crontab entry (eg /etc/cron.d/xymon-SMART.cron) containing this:
 +#
 +#    */5 * * * * root ( umask 002; XYMONCLIENTHOME=/usr/lib/xymon/client \
 +#       CONTROLLER=cciss COUNT=0 DEVICE=cciss/c0d0 \
 +#       /path/to/xymon-SMART.sh -w /tmp/SMART.status ) 2>/tmp/SMART.status.err
 +#
 +#    Adjust for your requirements.  Use "cat /proc/partitions" to
 +#    find a suitable DEVICE value.  Test out values with:
 +#
 +#        smartctl -d $CONTROLLER,$COUNT -i /dev/$DEVICE
 +#
 +#    For multiple devices, specify a comma-separated list of numbers
 +#    in the COUNT variable, such as:
 +#       ... COUNT=0,1 ...
 +#    smartctl itself does not accept a comma-separated list; the
 +#    script splits COUNT and runs smartctl once for each value.
 +#
 +# 3) Create a Xymon client tasks entry like this:
 +#
 +#    [smart]
 +#           ENVFILE $XYMONCLIENTHOME/etc/xymonclient.cfg
 +#           CMD /path/to/xymon-SMART.sh -r /tmp/SMART.status
 +#           LOGFILE $XYMONCLIENTLOGS/xymonclient.log
 +#           INTERVAL 5m
 +#
 +# Server-side:
 +#
 +# 4) Create entries in graphs.cfg like so:
 +#
 +#    [smart]
 +#        # total read/write errors
 +#        TITLE S.M.A.R.T. Total Media Errors
 +#        YAXIS errors per second
 +#        FNPATTERN ^smart.(.*).rrd
 +#        DEF:rc@RRDIDX@=@RRDFN@:err_r_c:AVERAGE
 +#        DEF:ru@RRDIDX@=@RRDFN@:err_r_u:AVERAGE
 +#        DEF:wc@RRDIDX@=@RRDFN@:err_w_c:AVERAGE
 +#        DEF:wu@RRDIDX@=@RRDFN@:err_w_u:AVERAGE
 +#        CDEF:re@RRDIDX@=rc@RRDIDX@,ru@RRDIDX@,+
 +#        CDEF:we@RRDIDX@=wc@RRDIDX@,wu@RRDIDX@,+
 +#        COMMENT:@RRDPARAM@\:\n
 +#        LINE1:re@RRDIDX@#@COLOR@:Read Errors         :
 +#        GPRINT:re@RRDIDX@:LAST:\: %5.1lf %s (cur)
 +#        GPRINT:re@RRDIDX@:MAX: %5.1lf %s (max)
 +#        GPRINT:re@RRDIDX@:MIN: %5.1lf %s (min)
 +#        GPRINT:re@RRDIDX@:AVERAGE: %5.1lf %s (avg)\n
 +#        LINE1:we@RRDIDX@#@COLOR@:Write Errors        :
 +#        GPRINT:we@RRDIDX@:LAST:\: %5.1lf %s (cur)
 +#        GPRINT:we@RRDIDX@:MAX: %5.1lf %s (max)
 +#        GPRINT:we@RRDIDX@:MIN: %5.1lf %s (min)
 +#        GPRINT:we@RRDIDX@:AVERAGE: %5.1lf %s (avg)\n
 +#
 +#    [smart_temp]
 +#        TITLE S.M.A.R.T. Disk Temperature
 +#        YAXIS Celsius
 +#        FNPATTERN ^smart.(.*).rrd
 +#        DEF:temp@RRDIDX@=@RRDFN@:temp:AVERAGE
 +#        LINE1:temp@RRDIDX@#@COLOR@:@RRDPARAM@ temperature:
 +#        GPRINT:temp@RRDIDX@:LAST:\: %5.1lf°C (cur)
 +#        GPRINT:temp@RRDIDX@:MAX: %5.1lf°C (max)
 +#        GPRINT:temp@RRDIDX@:MIN: %5.1lf°C (min)
 +#        GPRINT:temp@RRDIDX@:AVERAGE: %5.1lf°C (avg)\n
 +#
 +#    [smart_nonmedium]
 +#        TITLE S.M.A.R.T. Non-Medium Errors
 +#        YAXIS errors per second
 +#        FNPATTERN ^smart.(.*).rrd
 +#        DEF:nmec@RRDIDX@=@RRDFN@:err_nmec:AVERAGE
 +#        LINE1:nmec@RRDIDX@#@COLOR@:@RRDPARAM@ non-medium errors:
 +#        GPRINT:nmec@RRDIDX@:LAST:\: %5.1lf %s (cur)
 +#        GPRINT:nmec@RRDIDX@:MAX: %5.1lf %s (max)
 +#        GPRINT:nmec@RRDIDX@:MIN: %5.1lf %s (min)
 +#        GPRINT:nmec@RRDIDX@:AVERAGE: %5.1lf %s (avg)\n
 +#
 +#    Add further graph definitions as desired.
 +#    The RRD files contain the following DS names:
 +#    - err_r_c  = corrected read errors
 +#    - err_w_c  = corrected write errors
 +#    - err_r_u  = uncorrected read errors
 +#    - err_w_u  = uncorrected write errors
 +#    - err_nmec = non-medium errors
 +#    - temp     = disk temperature
 +#
 +# 5) Add "smart" to the TEST2RRD and GRAPHS variables in
 +#    xymonserver.cfg, to have the graphs included on the
 +#    smart status page and the trends page.
 +#
 +# 6) Add "TRENDS:*,smart:smart|smart_temp" to the relevant
 +#    entries in hosts.cfg, or the "_default_" entry.
 +#
 +#
 +# Troubleshooting
 +# ---------------
 +#
 +# * Check the cron output in /tmp/SMART.status.err and look
 +#   for errors that indicate where the problem might be.
 +#
 +# * Check that the file /tmp/SMART.status is being updated.
 +#   If not, ensure that the script is being run by cron.
 +#
 +# * Ensure that the crontab entry is being run.  On some
 +#   systems, simply creating a file in /etc/cron.d/ will
 +#   not tell crond that there has been a change to its
 +#   configuration.  If this appears to be a problem, simply
 +#   touch the directory containing the crontabs, such as
 +#
 +#      sudo touch /var/spool/cron/tabs
 +#
 +# * If the status file appears correct, manually run the
 +#   script in read (-r) mode with debugging and dry-run:
 +#
 +#      xymoncmd /path/to/xymon-SMART.sh -r -d 1 -n /tmp/SMART.status
 +#
 +# * Check the Xymon log files, particularly xymonclient.log,
 +#   xymonlaunch.log and rrd-status.log.
 +#
 +#
 +# A note about compatibility
 +# --------------------------
 +#
 +# This script makes use of features of GNU "ls" and
 +# GNU "date" to determine if a status file is fresh.
 +# This probably won't work on systems that don't have
 +# GNU "ls" and GNU "date".  However, such a scenario
 +# is unlikely on systems where smartctl is functioning.
 +
 +die() { echo "$@" >&2; exit 1; }
 +
 +VERSION=0.5
 +
 +NL="
 +"      # newline character
 +
 +
 +if [ "$DEBUG" ]; then
 +        BB="echo"
 +        [ "$XYMONCLIENTHOME" ] || XYMONCLIENTHOME="/usr/lib/xymon/client"
 +        [ "$BBDISP" ] || BBDISP="0.0.0.0"
 +        [ "$MACHINE" ] || MACHINE="machine"
 +fi
 +
 +[ "$XYMON" ] || XYMON="$BB"
 +[ "$XYMSRV" ] || XYMSRV="$BBDISP"
 +
 +COLOR="clear"
 +COLUMN="smart"
 +CONFIG="${XYMONCLIENTHOME}/etc/smart.conf"
 +MSG="No S.M.A.R.T. device detected."
 +RAID=""
 +RAIDADDR=""
 +SMARTCTL="/usr/sbin/smartctl"
 +SUDO="/usr/bin/sudo"
 +
 +setup_config() {
 +        # read config file
 +        if [ -f $CONFIG ]; then
 +                . "$CONFIG"
 +        else
 +                [ "$CONTROLLER" -a "$COUNT" -a "$DEVICE" ] ||
 +                        die "Configuration file not found: $CONFIG"
 +        fi
 +
 +        if [ -n "$CONTROLLER" ]; then
 +                RAIDADDR="$CONTROLLER,$COUNT"
 +                RAID="-d $RAIDADDR"
 +                [ 0$DEBUG -gt 1 ] && echo "debug: RAID set to '$RAID'"
 +        fi
 +
 +        [ -b "/dev/$DEVICE" ] || die "Invalid device: /dev/$DEVICE"
 +
 +        RESULT="Device:\n\t$DEVICE\n\nStatus:\n\n"
 +}
 +
 +get_smart_status() {
 +        # we parse the output and set some flags
 +        echo "$@" | while read LINE; do
 +                case $LINE in
 +                        "Device Address:"*)
 +                                COUNTER=`expr 0$COUNTER + 1`
 +                                set - $LINE""
 +                                DEVADDR=$3
 +                                echo "DEVADDR_$COUNTER=$DEVADDR"
 +                                echo "DEVICES=\"\$DEVICES $COUNTER\""
 +                                ;;
 +                        "Self Test returned without error")
 +                                echo "SMART_SELFTEST_$COUNTER=OK"
 +                                ;;
 +                        "SMART Health Status:"*)
 +                                set - $LINE""
 +                                echo "SMART_HEALTH_$COUNTER=$4"
 +                                ;;
 +                        "Device supports SMART and is Enabled")
 +                                set - $LINE""
 +                                echo "SMART_ENABLED_$COUNTER=1"
 +                                echo "SMART_ENABLED=1"
 +                                ;;
 +                esac
 +        done
 +}
 +
 +get_rrd_data() {
 +        # we parse the output and show some numbers
 +        echo "$@" | while read LINE; do
 +                case $LINE in
 +                        "Device Address:"*)
 +                                set - $LINE""
 +                                [ "$FIRST" ] && echo ""
 +                                echo "[smart.$3.rrd]"
 +                                FIRST=1
 +                                [ 0$DEBUG -gt 0 ] && echo "Found device $3" >&2
 +                                ;;
 +                        read:*)
 +                                set - $LINE""
 +                                echo "DS:err_r_c:COUNTER:600:0:U $5"
 +                                echo "DS:err_r_u:COUNTER:600:0:U $8"
 +                                ;;
 +                        write:*)
 +                                set - $LINE""
 +                                echo "DS:err_w_c:COUNTER:600:0:U $5"
 +                                echo "DS:err_w_u:COUNTER:600:0:U $8"
 +                                ;;
 +                        "Non-medium error count:"*)
 +                                set - $LINE""
 +                                echo "DS:err_nmec:COUNTER:600:0:U $4"
 +                                ;;
 +                        "Current Drive Temperature:"*)
 +                                set - $LINE""
 +                                echo "DS:temp:GAUGE:600:U:U $4"
 +                                ;;
 +                esac
 +        done
 +}
 +
 +show_version() {
 +        echo "Version: $VERSION"
 +}
 +
 +show_usage() {
 +        echo "Usage: $0 [-w writefile|-r readfile|-n|-d|-d N|-h|-V]"
 +        show_version;
 +        echo "Specify -w filename (or --write) to write to file (use '-' for STDOUT)"
 +        echo "Specify -r filename (or --read) to read from a file (use '-' for STDIN)"
 +        echo "Specify -d [N] (or --debug [N]) to enable debug mode, optionally with a debug level"
 +        echo "Specify -n (or --dryrun) to stop short of updating Xymon (typically used with -d)"
 +        echo "Typically, run as root: '$0 -w > tmpfile' and then as Xymon: '$0 -r < tmpfile'."
 +        echo "If no switches are given, Xymon must have sudo rights to run the script with no password."
 +}
 +
 +# Handle CLI modifiers
 +while [ "$1" ]; do
 +        case "$1" in
 +                "")             ;;
 +                -d|--debug)     DEBUG=1
 +                                test 0$2 -gt 0 2>/dev/null && { DEBUG=$2; shift; }
 +                                echo "debug: Debug level $DEBUG"
 +                                ;;
 +                -q|--quiet)     QUIET=1
 +                                ;;
 +                -r|--read)      READ=1
 +                                [ 0$DEBUG -gt 0 ] && echo "debug: read mode"
 +                                [ "$2" ] || die "Specify file to read"
 +                                READFILE="$2"
 +                                shift
 +                                if [ -f "$READFILE" ]; then
 +                                        [ -r "$READFILE" ] || die "Unable to read file: $READFILE"
 +                                else
 +                                        [ 0$QUIET -gt 0 ] && exit
 +                                        die "File not found: $READFILE"
 +                                fi
 +                                ;;
 +                -w|--write)
 +                                [ 0$DEBUG -gt 0 ] && echo "debug: write mode"
 +                                [ "$2" ] || die "Specify file to write"
 +                                WRITEFILE="$2"
 +                                shift
 +                                [ "$WRITEFILE" = "-" ] || : > "$WRITEFILE"
 +                                for C in `IFS=,; set - ""$COUNT; echo $@`; do
 +                                        COUNT=$C setup_config
 +                                        if [ "$WRITEFILE" = "-" ]; then
 +                                                [ "$RAIDADDR" ] && echo "Device Address: $RAIDADDR"
 +                                                $SMARTCTL /dev/$DEVICE $RAID --all -X
 +                                        else
 +                                                # assume that $SMARTCTL or ">" will output any errors
 +                                                # so we just bail silently with RC=1
 +                                                {
 +                                                        if [ "$RAIDADDR" ]; then
 +                                                                [ -s $WRITEFILE ] && echo ""
 +                                                                echo "Device Address: $RAIDADDR"
 +                                                        fi
 +                                                        $SMARTCTL /dev/$DEVICE $RAID --all -X
 +                                                } >> $WRITEFILE || exit 1
 +                                        fi
 +                                        [ 0$DEBUG -eq 0 -o "$WRITEFILE" = "-" ] || cat $WRITEFILE | sed 's/^/debug: /'
 +                                done
 +                                exit
 +                                ;;
 +                -n|--dryrun)    DRYRUN=1
 +                                ;;
 +                -V|--version)
 +                                show_version
 +                                exit
 +                                ;;
 +                -h|--help)
 +                                show_usage
 +                                exit
 +                                ;;
 +                *)              die "Unexpected parameter: $1"  ;;
 +        esac
 +        shift
 +done
 +
 +if [ 0$READ -gt 0 ]; then
 +        [ 0$DEBUG -gt 0 ] && echo "debug: reading status from file '$READFILE'"
 +        # bail if the file is older than 600 seconds (10 minutes)
 +        if [ "$READFILE" = "-" ]; then
 +                FILETIME=`ls -lL --time-style "+%s" </dev/stdin | { read X X X X X B X; echo $B; }`
 +        else
 +                FILETIME=`ls -lL --time-style "+%s" $READFILE | { read X X X X X B X; echo $B; }`
 +        fi
 +        TIMENOW=`date "+%s"`
 +        TIMEDIFF=`expr $TIMENOW - $FILETIME`
 +        [ 0$TIMEDIFF -lt 0 ] && die "Invalid timestamp"
 +        [ 0$TIMEDIFF -gt 600 ] && die "Stale SMART file is $TIMEDIFF seconds old"
 +        if [ "$READFILE" = "-" ]; then
 +                TMP=`cat`
 +        else
 +                TMP=`cat $READFILE`
 +        fi
 +else
 +        TMP=""
 +        for C in `IFS=,; set - ""$COUNT; echo $@`; do
 +                COUNT=$C setup_config
 +                [ "$RAIDADDR" ] && TMP="$TMP${NL}`echo Device Address: $RAIDADDR`"
 +                TMP="$TMP${NL}`$SUDO $SMARTCTL /dev/$DEVICE $RAID --all -X`"
 +        done
 +fi
 +
 +SMARTSTATUS=`get_smart_status "$TMP"`
 +[ 0$DEBUG -gt 1 ] && echo "$SMARTSTATUS"
 +eval $SMARTSTATUS
 +
 +RRDDATA=`get_rrd_data "$TMP"`
 +
 +[ "$SMART_ENABLED" ] && SMART=1
 +
 +[ "$XYMON" ] || die "Xymon environment is not set up"
 +
 +MSG="$TMP"
 +for DEVINDEX in $DEVICES; do
 +        COLOR="green"
 +
 +        eval DEVNAME=\$DEVADDR_$DEVINDEX
 +        [ 0$DEBUG -gt 0 ] && echo "Checking SMART for $DEVNAME"
 +
 +        eval SMART_ENABLED=\$SMART_ENABLED_$DEVINDEX
 +        if [ "$SMART_ENABLED" ]; then
 +                RESULT="$RESULT\t&green $DEVNAME supports SMART and is enabled\n"
 +        else
 +                COLOR="yellow"
 +                RESULT="$RESULT\t&yellow $DEVNAME does not support SMART or is not enabled\n"
 +        fi
 +
 +        eval SMART_HEALTH=\$SMART_HEALTH_$DEVINDEX
 +        if [ "$SMART_HEALTH" = "OK" ]; then
 +                RESULT="$RESULT\t&green $DEVNAME SMART Health Status: OK\n"
 +        else
 +                COLOR="red"
 +                RESULT="$RESULT\t&red $DEVNAME SMART Health Status: $SMART_HEALTH\n"
 +        fi
 +
 +        SELF=`echo "$TMP" | grep "Self Test returned without error"`
 +        eval SMART_SELFTEST=\$SMART_SELFTEST_$DEVINDEX
 +        if [ "$SMART_SELFTEST" = "OK" ]; then
 +                RESULT="$RESULT\t&green $DEVNAME Self Test returned without error\n"
 +        else
 +                COLOR="red"
 +                RESULT="$RESULT\t&red $DEVNAME Self Test returned with error: $SMART_SELFTEST\n"
 +        fi
 +done
 +
 +MSG=`echo -e "\n$RESULT\n\n$MSG\n"`
 +
 +if [ 0$DEBUG -gt 0 ]; then
 +        echo "Messages to Xymon:"
 +        echo
 +        echo $XYMON $BBDISP "status $MACHINE.$COLUMN $COLOR `date` $MSG"
 +        echo
 +        echo $XYMON $BBDISP "data $MACHINE.trends${NL}$RRDDATA"
 +fi
 +if [ 0$DRYRUN -eq 0 ]; then
 +        $XYMON $BBDISP "status $MACHINE.$COLUMN $COLOR `date` $MSG"
 +        $XYMON $BBDISP "data $MACHINE.trends${NL}$RRDDATA"
 +fi
 +</code>
 +</hidden>
 +
 +===== Known Bugs and Issues =====
 +
 +===== To Do =====
 +
 +===== Credits =====
 +
 +===== Changelog =====
 +
 +  * **2012-08-30**
 +    * Initial release
  • monitors/xymon-smart.txt
  • Last modified: 2012/08/30 05:14
  • (external edit)