monitors:xymon-smart [2012/08/30 05:14] (current)
 +====== xymon-SMART.sh ======
  
 +^ Author | [[ jlaidman+xymon-smart@rebel-it.com.au | Jeremy Laidman ]] |
 +^ Compatibility | Xymon 4.3.3 |
 +^ Requirements | smartmontools (provides smartctl), GNU ls, GNU date |
 +^ Download | None |
 +^ Last Update | 2012-08-30 |
 +
 +===== Description =====
 +This script queries the SMART parameters of the drives on a system, and returns the status of those drives as well as reporting various metrics available from the SMART data.
 +
 +The script gets its configuration from the environment or from a configuration file.
 +
 +The script runs in write mode (with a "​-w"​ switch) to create the status file from the output of the smartctl command. Typically this is done every 5 minutes from cron.
 +
 +The script also runs in read mode (with a "​-r"​ switch) to read in the status file and parse it for sending data and status reports to Xymon. ​ Typically this is done every 5 minutes from a xymonlaunch configuration file (tasks.cfg on a Xymon server, or xymonlaunch.cfg on a Xymon client).
 +
 +In read mode, the script constructs a status report for Xymon that warns if any of the following problems is detected:
 +  * SMART is not enabled on the drive
 +  * SMART self-test is not "​OK"​
 +  * SMART health status is not "​OK"​
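The way these three checks map to a Xymon colour can be sketched as below. This is a minimal illustration that mirrors the per-device logic in the script's read mode; the function name `smart_color` and its argument order are assumptions made for the example and do not exist in xymon-SMART.sh.

```shell
#!/bin/sh
# Sketch only: smart_color is a hypothetical helper, not part of xymon-SMART.sh.
# Arguments: enabled flag ("1" or empty), health status, self-test status.
smart_color() {
    enabled=$1
    health=$2
    selftest=$3
    color=green
    # SMART unsupported or disabled is a warning
    [ "$enabled" = "1" ] || color=yellow
    # a bad health status or a failed self-test is an error
    [ "$health" = "OK" ] || color=red
    [ "$selftest" = "OK" ] || color=red
    echo "$color"
}

smart_color 1 OK OK
smart_color "" OK OK
smart_color 1 FAILED OK
```

The checks run in order of increasing severity, so a later failure overrides an earlier warning.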
 + 
 +The script also sends a data report for Xymon to turn into RRD files for graphing. ​ The data points reported are:
 +  * corrected read errors
 +  * corrected write errors
 +  * uncorrected read errors
 +  * uncorrected write errors
 +  * non-medium errors
 +  * disk temperature
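As a rough illustration of how these data points are extracted, the sketch below turns smartctl-style text into the DS lines that make up the data report. It follows the field positions used in the script's source, but the function name `smart_to_rrd` and the sample input are made up for the example; real smartctl output depends on the drive and controller.

```shell
#!/bin/sh
# Sketch only: smart_to_rrd is a hypothetical name, not a function of xymon-SMART.sh.
# Reads smartctl-style text on stdin and prints one DS line per metric.
smart_to_rrd() {
    while read -r LINE; do
        case $LINE in
            "Device Address:"*)
                set -- $LINE
                echo "[smart.$3.rrd]"
                ;;
            read:*)
                # field 5 = total corrected errors, field 8 = total uncorrected errors
                set -- $LINE
                echo "DS:err_r_c:COUNTER:600:0:U $5"
                echo "DS:err_r_u:COUNTER:600:0:U $8"
                ;;
            write:*)
                set -- $LINE
                echo "DS:err_w_c:COUNTER:600:0:U $5"
                echo "DS:err_w_u:COUNTER:600:0:U $8"
                ;;
            "Non-medium error count:"*)
                set -- $LINE
                echo "DS:err_nmec:COUNTER:600:0:U $4"
                ;;
            "Current Drive Temperature:"*)
                set -- $LINE
                echo "DS:temp:GAUGE:600:U:U $4"
                ;;
        esac
    done
}

# Made-up sample input, shaped like the lines the script matches.
smart_to_rrd <<'EOF'
Device Address: cciss,0
Current Drive Temperature:     34 C
read:     10   2   0   12   12   120.500   1
write:     0   0   0    0    0    80.250   0
Non-medium error count:        7
EOF
```

The error counts are declared as COUNTER data sources, so the graphs show a rate of change (errors per second) rather than the raw totals; the temperature is a GAUGE and is stored as reported.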
 +
 +{{:​monitors:​xymon-smart.sh-1.png?​200|}}
 +
 +{{:​monitors:​xymon-smart.sh-2.png?​200|}}
 +
 +===== Installation =====
 +=== Client side ===
 +1) Copy the script into a suitable location, such as ''/​usr/​lib/​xymon/​client/​ext/​xymon-SMART.sh''​
 +
 +2) Create a crontab entry (e.g. /etc/cron.d/xymon-SMART.cron) containing this:
 +
 +<​code>​
 +*/5 * * * * root ( umask 002; XYMONCLIENTHOME=/​usr/​lib/​xymon/​client \
 +    CONTROLLER=cciss COUNT=0 DEVICE=cciss/​c0d0 \
 +    /​path/​to/​xymon-SMART.sh -w /​tmp/​SMART.status ) 2>/​tmp/​SMART.status.err
 +</​code>​
 +
 +Adjust for your requirements. ​ Use "cat /​proc/​partitions"​ to
 +find a suitable DEVICE value. ​ Test out values with:
 +
 +  smartctl -d $CONTROLLER,​$COUNT -i /​dev/​$DEVICE
 +
 +For multiple devices, specify a comma-separated list of numbers
 +in the COUNT variable, such as:
 +  ... COUNT=0,1 ...
 +Note: The comma-separated list is a feature of this script, not of smartctl. The script runs smartctl once for each number in the list.
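The expansion of a COUNT list can be sketched as follows. This is an illustration under assumptions: `split_counts` is a hypothetical helper name, the CONTROLLER and DEVICE values are the ones from the crontab example, and the loop only prints the smartctl commands instead of running them.

```shell
#!/bin/sh
# Sketch only: split_counts is a hypothetical helper, not part of xymon-SMART.sh.
# Prints the comma-separated numbers in $1 as a space-separated list.
split_counts() {
    # the subshell keeps the IFS change from leaking into the caller
    ( IFS=,; set -- $1; echo "$@" )
}

CONTROLLER=cciss
DEVICE=cciss/c0d0
COUNT=0,1

# One smartctl invocation per drive number (printed here, not executed).
for C in $(split_counts "$COUNT"); do
    echo "smartctl -d $CONTROLLER,$C -i /dev/$DEVICE"
done
```

Printing the commands first is a convenient way to check the values before putting them into the crontab entry.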
 +
 +3) Create a Xymon client tasks entry like this:
 +
 +  [smart]
 +       ​ENVFILE $XYMONCLIENTHOME/​etc/​xymonclient.cfg
 +       CMD /​path/​to/​xymon-SMART.sh -r /​tmp/​SMART.status
 +       ​LOGFILE $XYMONCLIENTLOGS/​xymonclient.log
 +       ​INTERVAL 5m
 +
 +=== Server side ===
 +4) Create entries in graphs.cfg like so:
 +
 +    [smart]
 +        # total read/write errors
 +        TITLE S.M.A.R.T. Total Media Errors
 +        YAXIS errors per second
 +        FNPATTERN ^smart.(.*).rrd
 +        DEF:​rc@RRDIDX@=@RRDFN@:​err_r_c:​AVERAGE
 +        DEF:​ru@RRDIDX@=@RRDFN@:​err_r_u:​AVERAGE
 +        DEF:​wc@RRDIDX@=@RRDFN@:​err_w_c:​AVERAGE
 +        DEF:​wu@RRDIDX@=@RRDFN@:​err_w_u:​AVERAGE
 +        CDEF:​re@RRDIDX@=rc@RRDIDX@,​ru@RRDIDX@,​+
 +        CDEF:​we@RRDIDX@=wc@RRDIDX@,​wu@RRDIDX@,​+
 +        COMMENT:​@RRDPARAM@\:​\n
 +        LINE1:​re@RRDIDX@#​@COLOR@:​Read Errors ​        :
 +        GPRINT:​re@RRDIDX@:​LAST:​\:​ %5.1lf %s (cur)
 +        GPRINT:​re@RRDIDX@:​MAX:​ %5.1lf %s (max)
 +        GPRINT:​re@RRDIDX@:​MIN:​ %5.1lf %s (min)
 +        GPRINT:​re@RRDIDX@:​AVERAGE:​ %5.1lf %s (avg)\n
 +        LINE1:​we@RRDIDX@#​@COLOR@:​Write Errors ​       :
 +        GPRINT:​we@RRDIDX@:​LAST:​\:​ %5.1lf %s (cur)
 +        GPRINT:​we@RRDIDX@:​MAX:​ %5.1lf %s (max)
 +        GPRINT:​we@RRDIDX@:​MIN:​ %5.1lf %s (min)
 +        GPRINT:​we@RRDIDX@:​AVERAGE:​ %5.1lf %s (avg)\n
 +    ​
 +    [smart_temp]
 +        TITLE S.M.A.R.T. Disk Temperature
 +        YAXIS Celsius
 +        FNPATTERN ^smart.(.*).rrd
 +        DEF:​temp@RRDIDX@=@RRDFN@:​temp:​AVERAGE
 +        LINE1:​temp@RRDIDX@#​@COLOR@:​@RRDPARAM@ temperature:​
 +        GPRINT:​temp@RRDIDX@:​LAST:​\:​ %5.1lf°C (cur)
 +        GPRINT:​temp@RRDIDX@:​MAX:​ %5.1lf°C (max)
 +        GPRINT:​temp@RRDIDX@:​MIN:​ %5.1lf°C (min)
 +        GPRINT:​temp@RRDIDX@:​AVERAGE:​ %5.1lf°C (avg)\n
 +    ​
 +    [smart_nonmedium]
 +        TITLE S.M.A.R.T. Non-Medium Errors
 +        YAXIS errors per second
 +        FNPATTERN ^smart.(.*).rrd
 +        DEF:​nmec@RRDIDX@=@RRDFN@:​err_nmec:​AVERAGE
 +        LINE1:​nmec@RRDIDX@#​@COLOR@:​@RRDPARAM@ non-medium errors:
 +        GPRINT:​nmec@RRDIDX@:​LAST:​\:​ %5.1lf %s (cur)
 +        GPRINT:​nmec@RRDIDX@:​MAX:​ %5.1lf %s (max)
 +        GPRINT:​nmec@RRDIDX@:​MIN:​ %5.1lf %s (min)
 +        GPRINT:​nmec@RRDIDX@:​AVERAGE:​ %5.1lf %s (avg)\n
 +
 +Add further graph definitions as desired.  The RRD files contain the following DS names:
 +  * err_r_c ​ = corrected read errors
 +  * err_w_c ​ = corrected write errors
 +  * err_r_u ​ = uncorrected read errors
 +  * err_w_u ​ = uncorrected write errors
 +  * err_nmec = non-medium errors
 +  * temp     = disk temperature
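As an example of a further definition, a graph of uncorrected errors only could look roughly like the stanza below. It is a sketch modelled on the definitions above and has not been taken from the script; the section name ''smart_uncorrected'' is an assumption, and it would also need to be listed in the TRENDS setting in hosts.cfg to appear on the trends page.

```
[smart_uncorrected]
        TITLE S.M.A.R.T. Uncorrected Errors
        YAXIS errors per second
        FNPATTERN ^smart.(.*).rrd
        DEF:ru@RRDIDX@=@RRDFN@:err_r_u:AVERAGE
        DEF:wu@RRDIDX@=@RRDFN@:err_w_u:AVERAGE
        CDEF:ue@RRDIDX@=ru@RRDIDX@,wu@RRDIDX@,+
        LINE1:ue@RRDIDX@#@COLOR@:@RRDPARAM@ uncorrected errors:
        GPRINT:ue@RRDIDX@:LAST:\: %5.1lf %s (cur)
        GPRINT:ue@RRDIDX@:MAX: %5.1lf %s (max)
        GPRINT:ue@RRDIDX@:MIN: %5.1lf %s (min)
        GPRINT:ue@RRDIDX@:AVERAGE: %5.1lf %s (avg)\n
```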
 +
 +5) Add "​smart"​ to the TEST2RRD and GRAPHS variables in xymonserver.cfg,​ to have the graphs included on the smart status page and the trends page.
 +
 +6) Add "​TRENDS:​*,​smart:​smart|smart_temp"​ to the relevant entries in hosts.cfg, or the "​_default_"​ entry.
 +===== Source =====
 +==== xymon-SMART.sh ====
 +
 +<hidden onHidden="​Show Code ⇲" onVisible="​Hide Code ⇱">​
 +<​code>​
 +#!/bin/sh
 +
 +# SMART disk monitor
 +# Jeremy Laidman, 2012
 +#
 +# Version 0.5 - August 2012
 +#    - initial public release
 +#
 +# Initially based on Michael Adelmann'​s "​smart"​ script
 +# (see: http://​xymonton.org/​monitors:​smart),​ the main
 +# improvements are to support multiple disks, and to
 +# send error counts for graphing.
 +#
 +# This script queries the SMART parameters of the drives
 +# on a system, and returns the status of those drives
 +# as well as reporting various metrics available from
 +# the SMART data.
 +#
 +# How it Works
 +# ------------
 +#
 +# The script gets its configuration from the environment
 +# or from a configuration file.
 +#
 +# The script runs in write mode (with a "​-w"​ switch) to
 +# create the status file from the output of the smartctl
 +# command. ​ Typically this is done every 5 minutes from cron.
 +#
 +# The script also runs in read mode (with a "​-r"​ switch)
 +# to read in the status file and parse it for sending data
 +# and status reports to Xymon. ​ Typically this is done
 +# every 5 minutes from a xymonlaunch configuration file
 +# (tasks.cfg on a Xymon server, or xymonlaunch.cfg on
 +# a Xymon client).
 +#
 +# In read mode, the script constructs a status report
 +# for Xymon that warns if any of the following problems
 +# is detected:
 +#     - SMART is not enabled on the drive
 +#     - SMART self-test is not "​OK"​
 +#     - SMART health status is not "​OK"​
 +#
 +# The script also sends a data report for Xymon to turn
 +# into RRD files for graphing. ​ The data points reported
 +# are:
 +#    - corrected read errors
 +#    - corrected write errors
 +#    - uncorrected read errors
 +#    - uncorrected write errors
 +#    - non-medium errors
 +#    - disk temperature
 +#
 +#
 +# To Install
 +# ----------
 +#
 +# Client-side:​
 +#
 +# 1) Copy the script into a suitable location,
 +#    such as /​usr/​lib/​xymon/​client/​ext/​xymon-SMART.sh
 +#
 +# 2) Create a crontab entry (e.g. /etc/cron.d/xymon-SMART.cron) containing this:
 +#
 +#    */5 * * * * root ( umask 002; XYMONCLIENTHOME=/​usr/​lib/​xymon/​client \
 +#       ​CONTROLLER=cciss COUNT=0 DEVICE=cciss/​c0d0 \
 +#       /​path/​to/​xymon-SMART.sh -w /​tmp/​SMART.status ) 2>/​tmp/​SMART.status.err
 +#
 +#    Adjust for your requirements. ​ Use "cat /​proc/​partitions"​ to
 +#    find a suitable DEVICE value. ​ Test out values with:
 +#
 +#        smartctl -d $CONTROLLER,​$COUNT -i /​dev/​$DEVICE
 +#
 +#    For multiple devices, specify a comma-separated list of numbers
 +#    in the COUNT variable, such as:
 +#       ... COUNT=0,1 ...
 +#    The list is handled by this script, not by smartctl; the
 +#    script runs smartctl once for each number in the list.
 +#
 +# 3) Create a Xymon client tasks entry like this:
 +#
 +#    [smart]
 +#           ​ENVFILE $XYMONCLIENTHOME/​etc/​xymonclient.cfg
 +#           CMD /​path/​to/​xymon-SMART.sh -r /​tmp/​SMART.status
 +#           ​LOGFILE $XYMONCLIENTLOGS/​xymonclient.log
 +#           ​INTERVAL 5m
 +#
 +# Server-side:​
 +#
 +# 4) Create entries in graphs.cfg like so:
 +#
 +#    [smart]
 +#        # total read/write errors
 +#        TITLE S.M.A.R.T. Total Media Errors
 +#        YAXIS errors per second
 +#        FNPATTERN ^smart.(.*).rrd
 +#        DEF:​rc@RRDIDX@=@RRDFN@:​err_r_c:​AVERAGE
 +#        DEF:​ru@RRDIDX@=@RRDFN@:​err_r_u:​AVERAGE
 +#        DEF:​wc@RRDIDX@=@RRDFN@:​err_w_c:​AVERAGE
 +#        DEF:​wu@RRDIDX@=@RRDFN@:​err_w_u:​AVERAGE
 +#        CDEF:​re@RRDIDX@=rc@RRDIDX@,​ru@RRDIDX@,​+
 +#        CDEF:​we@RRDIDX@=wc@RRDIDX@,​wu@RRDIDX@,​+
 +#        COMMENT:​@RRDPARAM@\:​\n
 +#        LINE1:​re@RRDIDX@#​@COLOR@:​Read Errors ​        :
 +#        GPRINT:​re@RRDIDX@:​LAST:​\:​ %5.1lf %s (cur)
 +#        GPRINT:​re@RRDIDX@:​MAX:​ %5.1lf %s (max)
 +#        GPRINT:​re@RRDIDX@:​MIN:​ %5.1lf %s (min)
 +#        GPRINT:​re@RRDIDX@:​AVERAGE:​ %5.1lf %s (avg)\n
 +#        LINE1:​we@RRDIDX@#​@COLOR@:​Write Errors ​       :
 +#        GPRINT:​we@RRDIDX@:​LAST:​\:​ %5.1lf %s (cur)
 +#        GPRINT:​we@RRDIDX@:​MAX:​ %5.1lf %s (max)
 +#        GPRINT:​we@RRDIDX@:​MIN:​ %5.1lf %s (min)
 +#        GPRINT:​we@RRDIDX@:​AVERAGE:​ %5.1lf %s (avg)\n
 +#
 +#    [smart_temp]
 +#        TITLE S.M.A.R.T. Disk Temperature
 +#        YAXIS Celsius
 +#        FNPATTERN ^smart.(.*).rrd
 +#        DEF:​temp@RRDIDX@=@RRDFN@:​temp:​AVERAGE
 +#        LINE1:​temp@RRDIDX@#​@COLOR@:​@RRDPARAM@ temperature:​
 +#        GPRINT:​temp@RRDIDX@:​LAST:​\:​ %5.1lf°C (cur)
 +#        GPRINT:​temp@RRDIDX@:​MAX:​ %5.1lf°C (max)
 +#        GPRINT:​temp@RRDIDX@:​MIN:​ %5.1lf°C (min)
 +#        GPRINT:​temp@RRDIDX@:​AVERAGE:​ %5.1lf°C (avg)\n
 +#
 +#    [smart_nonmedium]
 +#        TITLE S.M.A.R.T. Non-Medium Errors
 +#        YAXIS errors per second
 +#        FNPATTERN ^smart.(.*).rrd
 +#        DEF:​nmec@RRDIDX@=@RRDFN@:​err_nmec:​AVERAGE
 +#        LINE1:​nmec@RRDIDX@#​@COLOR@:​@RRDPARAM@ non-medium errors:
 +#        GPRINT:​nmec@RRDIDX@:​LAST:​\:​ %5.1lf %s (cur)
 +#        GPRINT:​nmec@RRDIDX@:​MAX:​ %5.1lf %s (max)
 +#        GPRINT:​nmec@RRDIDX@:​MIN:​ %5.1lf %s (min)
 +#        GPRINT:​nmec@RRDIDX@:​AVERAGE:​ %5.1lf %s (avg)\n
 +#
 +#    Add further graph definitions as desired.
 +#    The RRD files produce the following DS names:
 +#    - err_r_c ​ = corrected read errors
 +#    - err_w_c ​ = corrected write errors
 +#    - err_r_u ​ = uncorrected read errors
 +#    - err_w_u ​ = uncorrected write errors
 +#    - err_nmec = non-medium errors
 +#    - temp     = disk temperature
 +#
 +# 5) Add "​smart"​ to the TEST2RRD and GRAPHS variables in
 +#    xymonserver.cfg,​ to have the graphs included on the
 +#    smart status page and the trends page.
 +#
 +# 6) Add "​TRENDS:​*,​smart:​smart|smart_temp"​ to the relevant
 +#    entries in hosts.cfg, or the "​_default_"​ entry.
 +#
 +#
 +# Troubleshooting
 +# ---------------
 +#
 +# * Check the cron output in /​tmp/​SMART.status.err and look
 +#   for errors that indicate where the problem might be.
 +#
 +# * Check that the file /​tmp/​SMART.status is being updated.
 +#   If not, ensure that the script is being run by cron.
 +#
 +# * Ensure that the crontab entry is being run.  On some
 +#   ​systems,​ simply creating a file in /​etc/​cron.d/​ will
 +#   not tell crond that there has been a change to its
 +#   ​configuration. ​ If this appears to be a problem, simply
 +#   touch the directory containing the crontabs, such as
 +#
 +#      sudo touch /​var/​spool/​cron/​tabs
 +#
 +# * If the status file appears correct, manually run the
 +#   ​script in read (-r) mode with debugging and dry-run:
 +#
 +#      xymoncmd /​path/​to/​xymon-SMART.sh -r -d 1 -n /​tmp/​SMART.status
 +#
 +# * Check the Xymon log files, particularly xymonclient.log,​
 +#   ​xymonlaunch.log and rrd-status.log.
 +#
 +#
 +# A note about compatibility
 +# --------------------------
 +#
 +# This script makes use of features of GNU "​ls"​ and
 +# GNU "​date"​ to determine if a status file is fresh.
 +# This probably won't work on systems that don't have
 +# GNU "​ls"​ and GNU "​date"​. ​ However such a scenario
 +# is unlikely on systems where smartctl is functioning.
 +
 +die() { echo "​$@"​ >&2; exit 1; }
 +
 +VERSION=0.5
 +
 +NL="
 +" ​      # newline character
 +
 +
 +if [ "​$DEBUG"​ ]; then
 +        BB="​echo"​
 +        [ "​$XYMONCLIENTHOME"​ ] || XYMONCLIENTHOME="/​usr/​lib/​xymon/​client"​
 +        [ "​$BBDISP"​ ] || BBDISP="​0.0.0.0"​
 +        [ "​$MACHINE"​ ] || MACHINE="​machine"​
 +fi
 +
 +[ "​$XYMON"​ ] || XYMON="​$BB"​
 +[ "​$XYMSRV"​ ] || XYMSRV="​$BBDISP"​
 +
 +COLOR="​clear"​
 +COLUMN="​smart"​
 +CONFIG="​${XYMONCLIENTHOME}/​etc/​smart.conf"​
 +MSG="​No S.M.A.R.T. device detected."​
 +RAID=""​
 +RAIDADDR=""​
 +SMARTCTL="/​usr/​sbin/​smartctl"​
 +SUDO="/​usr/​bin/​sudo"​
 +
 +setup_config() {
 +        # read config file
 +        if [ -f $CONFIG ]; then
 +                . "$CONFIG"     # POSIX sh has no "source" builtin
 +        else
 +                [ "​$CONTROLLER"​ -a "​$COUNT"​ -a "​$DEVICE"​ ] ||
 +                        die "​Configuration file not found: $CONFIG"​
 +        fi
 +
 +        if [ -n "​$CONTROLLER"​ ]; then
 +                RAIDADDR="​$CONTROLLER,​$COUNT"​
 +                RAID="​-d $RAIDADDR"​
 +                [ 0$DEBUG -gt 1 ] && echo "​debug:​ RAID set to '​$RAID'"​
 +        fi
 +
 +        [ -b "/​dev/​$DEVICE"​ ] || die "​Invalid device: /​dev/​$DEVICE"​
 +
 +        RESULT="​Device:​\n\t$DEVICE\n\nStatus:​\n\n"​
 +}
 +
 +get_smart_status() {
 +        # we parse the output and set some flags
 +        echo "​$@"​ | while read LINE; do
 +                case $LINE in
 +                        "​Device Address:"​*)
 +                                COUNTER=`expr 0$COUNTER + 1`
 +                                set - $LINE""​
 +                                DEVADDR=$3
 +                                echo "​DEVADDR_$COUNTER=$DEVADDR"​
 +                                echo "​DEVICES=\"​\$DEVICES $COUNTER\""​
 +                                ;;
 +                        "Self Test returned without error"​)
 +                                echo "​SMART_SELFTEST_$COUNTER=OK"​
 +                                ;;
 +                        "SMART Health Status:"​*)
 +                                set - $LINE""​
 +                                echo "​SMART_HEALTH_$COUNTER=$4"​
 +                                ;;
 +                        "​Device supports SMART and is Enabled"​)
 +                                set - $LINE""​
 +                                echo "​SMART_ENABLED_$COUNTER=1"​
 +                                echo "​SMART_ENABLED=1"​
 +                                ;;
 +                esac
 +        done
 +}
 +
 +get_rrd_data() {
 +        # we parse the output and show some numbers
 +        echo "​$@"​ | while read LINE; do
 +                case $LINE in
 +                        "​Device Address:"​*)
 +                                set - $LINE""​
 +                                [ "​$FIRST"​ ] && echo ""​
 +                                echo "​[smart.$3.rrd]"​
 +                                FIRST=1
 +                                [ 0$DEBUG -gt 0 ] && echo "Found device $3" >&2
 +                                ;;
 +                        read:*)
 +                                set - $LINE""​
 +                                echo "​DS:​err_r_c:​COUNTER:​600:​0:​U $5"
 +                                echo "​DS:​err_r_u:​COUNTER:​600:​0:​U $8"
 +                                ;;
 +                        write:*)
 +                                set - $LINE""​
 +                                echo "​DS:​err_w_c:​COUNTER:​600:​0:​U $5"
 +                                echo "​DS:​err_w_u:​COUNTER:​600:​0:​U $8"
 +                                ;;
 +                        "​Non-medium error count:"​*)
 +                                set - $LINE""​
 +                                echo "​DS:​err_nmec:​COUNTER:​600:​0:​U $4"
 +                                ;;
 +                        "​Current Drive Temperature:"​*)
 +                                set - $LINE""​
 +                                echo "​DS:​temp:​GAUGE:​600:​U:​U $4"
 +                                ;;
 +                esac
 +        done
 +}
 +
 +show_version() {
 +        echo "​Version:​ $VERSION"​
 +}
 +
 +show_usage() {
 +        echo "​Usage:​ $0 [-w writefile|-r readfile|-n|-d|-d N|-h|-V]"​
 +        show_version;​
 +        echo "​Specify -w filename (or --write) to write to file (use '​-'​ for STDOUT)"​
 +        echo "​Specify -r filename (or --read) to read from a file (use '​-'​ for STDIN)"​
 +        echo "​Specify -d [N] (or --debug [N]) to enable debug mode, optionally with a debug level"
 +        echo "​Specify -n (or --dryrun) to stop short of updating Xymon (typically used with -d)"
 +        echo "​Typically,​ run as root: '$0 -w > tmpfile'​ and then as Xymon: '$0 -r < tmpfile'​."​
 +        echo "If no switches are given, Xymon must have sudo rights to run the script with no password."​
 +}
 +
 +# Handle CLI modifiers
 +while [ "​$1"​ ]; do
 +        case "​$1"​ in
 +                ""​) ​            ;;
 +                -d|--debug) ​    ​DEBUG=1
 +                                test 0$2 -gt 0 2>/​dev/​null && { DEBUG=$2; shift; }
 +                                echo "​debug:​ Debug level $DEBUG"​
 +                                ;;
 +                -q|--quiet) ​    ​QUIET=1
 +                                ;;
 +                -r|--read) ​     READ=1
 +                                [ 0$DEBUG -gt 0 ] && echo "​debug:​ read mode"
 +                                [ "​$2"​ ] || die "​Specify file to read"
 +                                READFILE="​$2"​
 +                                shift
 +                                if [ -f "​$READFILE"​ ]; then
 +                                        [ -r "​$READFILE"​ ] || die "​Unable to read file: $READFILE"​
 +                                else
 +                                        [ 0$QUIET -gt 0 ] && exit
 +                                        die "File not found: $READFILE"​
 +                                fi
 +                                ;;
 +                -w|--write)
 +                                [ 0$DEBUG -gt 0 ] && echo "​debug:​ write mode"
 +                                [ "​$2"​ ] || die "​Specify file to write"
 +                                WRITEFILE="​$2"​
 +                                shift
 +                                > $WRITEFILE
 +                                for C in `IFS=,; set - ""​$COUNT;​ echo $@`; do
 +                                        COUNT=$C setup_config
 +                                        if [ "​$WRITEFILE"​ = "​-"​ ]; then
 +                                                [ "​$RAIDADDR"​ ] && echo "​Device Address: $RAIDADDR"​
 +                                                $SMARTCTL /​dev/​$DEVICE $RAID --all -X
 +                                        else
 +                                                # assume that $SMARTCTL or ">"​ will output any errors
 +                                                # so we just bail silently with RC=1
 +                                                {
 +                                                        if [ "​$RAIDADDR"​ ]; then
 +                                                                [ -s $WRITEFILE ] && echo ""​
 +                                                                echo "​Device Address: $RAIDADDR"​
 +                                                        fi
 +                                                        $SMARTCTL /​dev/​$DEVICE $RAID --all -X
 +                                                } >> $WRITEFILE || exit 1
 +                                        fi
 +                                        [ 0$DEBUG -eq 0 -o "​$WRITEFILE"​ = "​-"​ ] || cat $WRITEFILE | sed '​s/​^/​debug:​ /'
 +                                done
 +                                exit
 +                                ;;
 +                -n|--dryrun) ​   DRYRUN=1
 +                                ;;
 +                -V|--version)
 +                                show_version
 +                                exit
 +                                ;;
 +                -h|--help)
 +                                show_usage
 +                                exit
 +                                ;;
 +                *)              die "​Unexpected parameter: $1" ​ ;;
 +        esac
 +        shift
 +done
 +
 +if [ 0$READ -gt 0 ]; then
 +        [ 0$DEBUG -gt 0 ] && echo "​debug:​ reading status from file '​$READFILE'"​
 +        # bail if the file is older than 10 minutes (600 seconds)
 +        if [ "​$READFILE"​ = "​-"​ ]; then
 +                FILETIME=`ls -lL --time-style "​+%s"​ </​dev/​stdin | { read X X X X X B X; echo $B; }`
 +        else
 +                FILETIME=`ls -lL --time-style "​+%s"​ $READFILE | { read X X X X X B X; echo $B; }`
 +        fi
 +        TIMENOW=`date "​+%s"​`
 +        TIMEDIFF=`expr $TIMENOW - $FILETIME`
 +        [ 0$TIMEDIFF -lt 0 ] && die "​Invalid timestamp"​
 +        [ 0$TIMEDIFF -gt 600 ] && die "Stale SMART file is $TIMEDIFF seconds old"
 +        if [ "​$READFILE"​ = "​-"​ ]; then
 +                TMP=`cat`
 +        else
 +                TMP=`cat $READFILE`
 +        fi
 +else
 +        TMP=""​
 +        for C in `IFS=,; set - ""​$COUNT;​ echo $@`; do
 +                COUNT=$C setup_config
 +                [ "​$RAIDADDR"​ ] && TMP="​$TMP${NL}`echo Device Address: $RAIDADDR`"​
 +                TMP="$TMP${NL}`$SUDO $SMARTCTL /dev/$DEVICE $RAID --all -X`"
 +        done
 +fi
 +
 +SMARTSTATUS=`get_smart_status "​$TMP"​`
 +[ 0$DEBUG -gt 1 ] && echo "​$SMARTSTATUS"​
 +eval $SMARTSTATUS
 +
 +RRDDATA=`get_rrd_data "​$TMP"​`
 +
 +[ "​$SMART_ENABLED"​ ] && SMART=1
 +
 +[ "$XYMON" ] || die "Xymon environment is not set up"
 +
 +MSG="​$TMP"​
 +for DEVINDEX in $DEVICES; do
 +        COLOR="​green"​
 +
 +        eval DEVNAME=\$DEVADDR_$DEVINDEX
 +        [ 0$DEBUG -gt 0 ] && echo "​Checking SMART for $DEVNAME"​
 +
 +        eval SMART_ENABLED=\$SMART_ENABLED_$DEVINDEX
 +        if [ "​$SMART_ENABLED"​ ]; then
 +                RESULT="​$RESULT\t&​green $DEVNAME supports SMART and is enabled\n"​
 +        else
 +                COLOR="​yellow"​
 +                RESULT="​$RESULT\t&​yellow $DEVNAME does not support SMART or is not enabled\n"​
 +        fi
 +
 +        eval SMART_HEALTH=\$SMART_HEALTH_$DEVINDEX
 +        if [ "​$SMART_HEALTH"​ = "​OK"​ ]; then
 +                RESULT="​$RESULT\t&​green $DEVNAME SMART Health Status: OK\n"
 +        else
 +                COLOR="​red"​
 +                RESULT="​$RESULT\t&​red $DEVNAME SMART Health Status: $SMART_HEALTH\n"​
 +        fi
 +
 +        SELF=`echo "​$TMP"​ | grep "Self Test returned without error"​`
 +        eval SMART_SELFTEST=\$SMART_SELFTEST_$DEVINDEX
 +        if [ "​$SMART_SELFTEST"​ = "​OK"​ ]; then
 +                RESULT="​$RESULT\t&​green $DEVNAME Self Test returned without error\n"​
 +        else
 +                COLOR="​red"​
 +                RESULT="​$RESULT\t&​red $DEVNAME Self Test returned with error: $SMART_SELFTEST\n"​
 +        fi
 +done
 +
 +MSG=`echo -e "​\n$RESULT\n\n$MSG\n"​`
 +
 +if [ 0$DEBUG -gt 0 ]; then
 +        echo "​Messages to Xymon:"​
 +        echo
 +        echo $XYMON $BBDISP "​status $MACHINE.$COLUMN $COLOR `date` $MSG"
 +        echo
 +        echo $XYMON $BBDISP "data $MACHINE.trends${NL}$RRDDATA"​
 +fi
 +if [ 0$DRYRUN -eq 0 ]; then
 +        $XYMON $BBDISP "​status $MACHINE.$COLUMN $COLOR `date` $MSG"
 +        $XYMON $BBDISP "data $MACHINE.trends${NL}$RRDDATA"​
 +fi
 +</​code>​
 +</​hidden>​
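In read mode the script rejects a status file that is more than 600 seconds old, which is why GNU ls and GNU date are listed as requirements. The check can be sketched on its own as below. The helper names `file_age` and `is_fresh` are assumptions made for this example; the timestamp technique (GNU `ls --time-style` plus `date +%s`) is the one the script uses.

```shell
#!/bin/sh
# Sketch only: file_age and is_fresh are hypothetical helper names.
# Requires GNU ls (--time-style) and a date that supports +%s.

# Print the age of file $1 in seconds, or fail if it cannot be determined.
file_age() {
    # with -l, the sixth field is the modification time (here in epoch seconds)
    mtime=$(ls -lL --time-style=+%s "$1" 2>/dev/null |
        { read -r X X X X X B X; echo "$B"; })
    [ -n "$mtime" ] || return 1
    now=$(date +%s)
    echo $((now - mtime))
}

# Succeed if file $1 is between 0 and $2 seconds old.
is_fresh() {
    age=$(file_age "$1") || return 1
    [ "$age" -ge 0 ] && [ "$age" -le "$2" ]
}

# Demonstration on a temporary file that has just been created.
STATUS=$(mktemp)
if is_fresh "$STATUS" 600; then
    echo "status file is fresh"
else
    echo "status file is stale or unreadable"
fi
rm -f "$STATUS"
```

A negative age is treated as invalid, matching the script's "Invalid timestamp" case for files dated in the future.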
 +
 +===== Known Bugs and Issues =====
 +
 +===== To Do =====
 +
 +===== Credits =====
 +
 +===== Changelog =====
 +
 +  * **2012-08-30**
 +    * Initial release