
smf2.ksh

Author: Vernon Everett
Compatibility: Xymon 4.2
Requirements: Solaris 10
Download: None
Last Update: 2010-08-03

A somewhat more complex SMF monitoring script, heavily inspired by Martin Ward's smf.sh on this site. Some of the changes:

  • Switched to ksh to work around some loop “features” of sh which I don't like.
  • Changed the config to a “one service per line” model with defined colours.
  • Allows you to monitor a service in any state, since some services should be disabled.
  • Added logic for when a service isn't there at all.
  • Added a “NOTINSTALLED” option to check that a service is not installed.
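
The last three items all come down to one lookup: is the service listed in the `svcs -aH` output, and if so, in which state? Here is a minimal sketch of that lookup, using a canned sample in place of real `svcs` output so it can be tried anywhere. The function name `svc_state` and the sample lines are illustrative only and are not part of smf2.ksh. The script itself uses `grep`, so it also matches partial names, whereas this sketch compares the full FMRI.

```shell
# Hypothetical sample of `svcs -aH` output: STATE, STIME, FMRI, one service per line.
SAMPLE='online         Aug_02   svc:/network/ssh:default
disabled       Aug_02   svc:/network/nfs/server:default
maintenance    10:15:42 svc:/system/cron:default'

# svc_state FMRI
# Prints the state of the first sample line whose FMRI equals the argument,
# or NOTFOUND when the service is not listed at all.
svc_state() {
    while read -r state stime fmri
    do
        if [ "$fmri" = "$1" ]
        then
            echo "$state"
            return 0
        fi
    done <<EOF
$SAMPLE
EOF
    echo "NOTFOUND"
}

svc_state svc:/network/ssh:default          # online
svc_state svc:/network/nfs/server:default   # disabled
svc_state svc:/network/cswpostfix:default   # NOTFOUND
```

A "disabled" result is only a problem if the configuration expected something else, and "NOTFOUND" is the desired result for a service marked NOTINSTALLED.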

This script has evolved to the point where it is no longer a drop-in replacement for Martin's smf.sh script, so it has been renamed and posted as a new script. We can debate the merits of this, but I believe in choice.

Martin's script is far simpler to use and configure.

Mine is more complex, but can do more.

Pick your poison.

Client side

1. Copy smf2.ksh to $HOME/client/ext

2. Edit the client/etc/clientlaunch.cfg and insert the following text:

# Service Monitoring
[smf]
        ENVFILE $HOBBITCLIENTHOME/etc/hobbitclient.cfg
        CMD $HOBBITCLIENTHOME/ext/smf2.ksh
        LOGFILE $HOBBITCLIENTHOME/logs/smf.log
        INTERVAL 5m

Server side

3. Edit the server/etc/client-local.cfg file and insert lines similar to this for each client or section:

[myhost]
SVC:svc:/network/ssh:default online red
SVC:svc:/application/hobbit:default online
SVC:svc:/system/cron:default online yellow
SVC:svc:/network/nfs/server:default disabled red
SVC:svc:/network/cswsyslog_ng:default online yellow
SVC:svc:/network/cswpostfix:default NOTINSTALLED

4. See script comments for more options
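
Each `SVC:` line carries up to three whitespace-separated fields after the prefix: the service, the expected state, and the colour to raise on a mismatch. The sketch below mirrors how the script strips the prefix and splits the fields. The `parse_svc_lines` name and the inline sample are assumptions for illustration; the real script reads the lines from the logfetch file the client receives from the server.

```shell
# parse_svc_lines: read client-local.cfg style lines on stdin and print
# the three fields smf2.ksh works with. Missing fields are shown as "-".
parse_svc_lines() {
    sed -n 's/^SVC://p' | while read -r svcid expstate fcolour
    do
        echo "service=$svcid expected=${expstate:--} colour=${fcolour:--}"
    done
}

parse_svc_lines <<'EOF'
SVC:svc:/network/ssh:default online red
SVC:svc:/application/hobbit:default online
SVC:svc:/network/cswpostfix:default NOTINSTALLED
EOF
# service=svc:/network/ssh:default expected=online colour=red
# service=svc:/application/hobbit:default expected=online colour=-
# service=svc:/network/cswpostfix:default expected=NOTINSTALLED colour=-
```

Going by the script's logic, a line with a state but no colour goes red on a mismatch. A line with neither treats online and legacy_run as green, uninitialized, offline and degraded as yellow, and maintenance and disabled as red.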


#!/usr/bin/ksh

# A Hobbit script to examine specific Solaris 10 services.

# Author: Martin Ward 19 Feb 2008.
# Version: 1.0 - Initial version.
# V1.1  Script now takes the list of services to monitor from the server
#       via the logfetch file.
# V2.0  Updated - Vernon Everett 02 Aug 2010
#       - Switched to ksh to avoid annoying variable issues in loops
#       - Changed how the client-local file worked to allow for single service per line
#       - Allowed you to specify if the service should be enabled or disabled
# V2.1  Updated - Vernon Everett 03 Aug 2010
#       - Added a NOTINSTALLED option to make sure services shouldn't be installed
#       - Added logic to cater for services that are supposed to be there, but are not
#       - Drove myself to drink getting my head around all those bloody if statements.
#       - If you think you can clean it up and make it better, please do.

# SVCS is a list of services to examine the status of. Each name must be
# specific enough to make it unique in the output from the 'svcs -a' command.
# The services themselves are configured on the Hobbit server in the
# ~hobbit/server/etc/client-local.cfg file. The lines can look something like:
# SVC:/network/ssh:default online red
# SVC:/site/tftpd:default offline yellow
# SVC:/system/sysidtool:system
# SVC:/service/definition/any anystatus green
# SVC:/service/definition/any NOTINSTALLED
# SVC:/service/definition/any NOTINSTALLED yellow
# One service per line
# By adding the green colour, it's listed at the top, but doesn't trigger a test fail (Could be useful
# for somebody, to ensure the service is highly visible, at the top of the list)

# The name of the column in Hobbit
COLUMN=smf

SVCSCMD=/usr/bin/svcs
SVCFILE=$BBTMP/svcs.$$
rm $SVCFILE.check >/dev/null 2>&1
SVCLIST=$SVCFILE.list

# Get a list of things to check for
grep "^SVC:" $BBTMP/logfetch.$MACHINEDOTS.cfg | sed "s/^SVC://g" >> $SVCFILE.check
$SVCSCMD -aH > $SVCLIST
# Make sure it's empty.
echo " " > $SVCFILE.out

# Set up the initial colour
COLOUR=green

# Check if we have services to keep tabs on
# If not, drop through, and just report a full list of services. Same as svcs -a
if [ -s ${SVCFILE}.check ]      # -s: the file exists and lists at least one service
then
   while read SVCID EXPSTATE FCOLOUR  # Service, expected state and colour
   do
      LCOLOUR=green                   # Set the line colour
      SVCLINE=$(grep "$SVCID" $SVCLIST) # Now find the service and start checking it

      if [ -z "$SVCLINE" ]            # Oops! We are looking for something and can't find it.
      then
         if [ "$EXPSTATE" != "NOTINSTALLED" ] # OK, it should be there.
         then
            # This is bad
            if [ -n "$FCOLOUR" ]
            then
               # A colour was defined for this
               LCOLOUR="$FCOLOUR"
               [ "$LCOLOUR" != "green" -a "$COLOUR" != "red" ] && COLOUR="$LCOLOUR"
            else
               # No colour defined, assume "not there" is really bad.
               LCOLOUR="red"
               COLOUR="red"
            fi
         fi
         SVCLINE="NOT FOUND               $SVCID" # Couldn't find it. Set the line to show what we were looking for
      else
         echo "$SVCLINE" | while read STATE TIME SVCS # We found it. That might be good.
         do
            if [ "$EXPSTATE" = "NOTINSTALLED" ] # Unless it shouldn't be there
            then
               if [ -n "$SVCLINE" ]
               then
                  # It's there and it shouldn't be. That's bad.
                  if [ -n "$FCOLOUR" ]
                  then
                     # A colour was defined for this
                     LCOLOUR=$FCOLOUR
                     [ "$LCOLOUR" != "green" -a "$COLOUR" != "red" ] && COLOUR=$LCOLOUR
                  else
                     # No colour defined. Figure one out.
                     if [ "$STATE" = "online" -o "$STATE" = "legacy_run" ]
                     then
                        LCOLOUR="red"      # It's also running. Really bad.
                        COLOUR="red"
                     else
                        LCOLOUR="yellow"     # It's there, but not running. Not so bad, but bad enough
                        [ "$COLOUR" != "red" ] && COLOUR="yellow"
                     fi
                  fi
               fi
            else
               if [ "$FCOLOUR" != "green" ] # We can set the fail colour to green, so we see it at the top, but don't really want
                                            # it to trigger an alert. Kinda handy for keeping it easily visible.
               then
                  if [ -z "$EXPSTATE" ]
                  then
                  # We never defined an expected state. Assume it should be up or legacy_run
                  # and anything else is bad - or at least slightly bad
                     case "${STATE}" in
                        'uninitialized'|'offline'|'degraded')
                           LCOLOUR="yellow"
                           if [ "${COLOUR}" != "red" ]
                           then
                              COLOUR="yellow"
                           fi
                           ;;
                        'maintenance'|'disabled')
                           LCOLOUR="red"
                           COLOUR="red"
                           ;;
                        'online'|'legacy_run')
                           LCOLOUR="green"
                           ;;
                     esac
                  else
                  # We have defined an expected state, and probably the colour if it fails
                     if [ "$EXPSTATE" != "$STATE" ]
                     then
                        if [ -z "$FCOLOUR" ]  # We didn't set a fail colour, so make it red
                        then
                           LCOLOUR=red
                           COLOUR=red
                        else
                           LCOLOUR=$FCOLOUR    # Otherwise use the defined fail colour
                           [ "${COLOUR}" != "red" ] && COLOUR="$FCOLOUR"
                        fi
                     fi
                  fi
               fi
            fi
         done
      fi
      echo "&$LCOLOUR $SVCLINE" >> $SVCFILE.out
   done < ${SVCFILE}.check
fi
echo >> $SVCFILE.out
# Collect a full list of the services.
cat $SVCLIST  >> $SVCFILE.out

# Tell Hobbit about it
$BB $BBDISP "status $MACHINE.$COLUMN $COLOUR `date ; echo ` `cat ${SVCFILE}.out` "
# And clean up a little
rm -f ${SVCFILE} ${SVCFILE}.out ${SVCFILE}.check ${SVCFILE}.list > /dev/null 2>&1
exit 0
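
The script comments invite cleanup of the nested `if` statements. One possible direction, offered as a sketch that has not been run on Solaris and is not a replacement for the script: separate "what colour does this line deserve" from "how does that change the overall colour". The function names `line_colour` and `worst_colour` are made up here, the value `NOTFOUND` stands for a service missing from the `svcs` output, and the rules are one reading of the script above that assumes only green, yellow and red are ever used.

```shell
# worst_colour A B: print the more severe of two colours (red > yellow > green).
worst_colour() {
    case "$1:$2" in
        red:*|*:red)       echo red ;;
        yellow:*|*:yellow) echo yellow ;;
        *)                 echo green ;;
    esac
}

# line_colour EXPSTATE STATE FCOLOUR: colour for one configured service.
# EXPSTATE and FCOLOUR may be empty strings; STATE is NOTFOUND if not listed.
line_colour() {
    expstate=$1 state=$2 fcolour=$3
    if [ "$state" = "NOTFOUND" ]
    then
        # Missing is fine only when the service is meant to be absent.
        if [ "$expstate" = "NOTINSTALLED" ]
        then
            echo green
        else
            echo "${fcolour:-red}"
        fi
    elif [ "$expstate" = "NOTINSTALLED" ]
    then
        # Present but unwanted: worse if it is actually running.
        case "$state" in
            online|legacy_run) echo "${fcolour:-red}" ;;
            *)                 echo "${fcolour:-yellow}" ;;
        esac
    elif [ "$fcolour" = "green" ]
    then
        echo green      # "green" fail colour: show the line, never alert
    elif [ -z "$expstate" ]
    then
        # No expected state given: judge by the state itself.
        case "$state" in
            uninitialized|offline|degraded) echo yellow ;;
            maintenance|disabled)           echo red ;;
            *)                              echo green ;;
        esac
    elif [ "$expstate" != "$state" ]
    then
        echo "${fcolour:-red}"
    else
        echo green
    fi
}

COLOUR=green
LC=$(line_colour online online red);       COLOUR=$(worst_colour "$COLOUR" "$LC")
LC=$(line_colour disabled online yellow);  COLOUR=$(worst_colour "$COLOUR" "$LC")
LC=$(line_colour NOTINSTALLED NOTFOUND ""); COLOUR=$(worst_colour "$COLOUR" "$LC")
echo "$COLOUR"   # yellow
```

With the decision in one function, each rule can be checked from the command line without touching a live system.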

Known bugs: none known, but let me know if you find any.

To do: fix any bugs reported to me.

As stated above, this script was heavily inspired by the efforts of Martin Ward (see smf.sh) and I am grateful to him for showing me the way.

I guess a certain level of dubious gratitude should go to my colleague who kept wanting the script to do more, and making my life a pain. :-)

  • 2010-08-03
    • Initial release
  • Last modified: 2010/08/03 05:56