monitors:smf2.ksh


monitors:smf2.ksh [2010/08/03 05:56] (current) – created - external edit 127.0.0.1
 +====== smf2.ksh ======
 +
 +^ Author | [[ everett.vernon@gmail.com | Vernon Everett ]] |
 +^ Compatibility | Xymon 4.2 |
 +^ Requirements | Solaris 10 |
 +^ Download | None |
 +^ Last Update | 2010-08-03 |
 +
 +===== Description =====
 +A somewhat more complex SMF monitoring script, heavily inspired by Martin Ward's smf.sh on this site.
 +Some of the changes:
 +  * Switched to ksh to avoid the variable "features" of loops in sh, which I don't like.
 +  * Changed the config to a "one service per line" model with defined colours.
 +  * Allows you to monitor a service in any state, since some services should be disabled.
 +  * Added logic for when a service isn't there at all.
 +  * Added a "NOTINSTALLED" option to check that a service is not installed.
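
The behaviours listed above can be condensed into one decision per service. The following is a minimal sketch of that decision, not the code used in smf2.ksh itself; ''line_colour'' and its argument order are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch only: line_colour is a hypothetical helper, not a function in smf2.ksh.
# Usage: line_colour EXPECTED ACTUAL [FAILCOLOUR]
#   EXPECTED   expected state, NOTINSTALLED, or "" when none was configured
#   ACTUAL     state reported by svcs, or "" when the service was not found
#   FAILCOLOUR colour to report on a mismatch, or "" for the default
line_colour() {
    expected=$1 actual=$2 fail=$3

    if [ -z "$actual" ]; then
        # Service is missing: acceptable only if that is what was asked for.
        if [ "$expected" = "NOTINSTALLED" ]; then echo green; else echo "${fail:-red}"; fi
        return
    fi

    if [ "$expected" = "NOTINSTALLED" ]; then
        # Present but unwanted: red if running, yellow otherwise,
        # unless a fail colour was configured.
        if [ -n "$fail" ]; then
            echo "$fail"
        else
            case $actual in
                online|legacy_run) echo red ;;
                *)                 echo yellow ;;
            esac
        fi
        return
    fi

    # A green fail colour keeps the line visible but never alerts.
    if [ "$fail" = "green" ]; then echo green; return; fi

    if [ -z "$expected" ]; then
        # No expected state configured: assume the service should be running.
        case $actual in
            uninitialized|offline|degraded) echo yellow ;;
            maintenance|disabled)           echo red ;;
            *)                              echo green ;;
        esac
    elif [ "$expected" = "$actual" ]; then
        echo green
    else
        echo "${fail:-red}"
    fi
}

line_colour online online red        # service is in the expected state
line_colour disabled online yellow   # should be disabled but is running
line_colour NOTINSTALLED ""          # not installed, as required
```

The overall page colour would then be the worst of the per-line colours, which is what the script tracks in ''COLOUR''.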
 +
 +This script has evolved to the point where it is no longer a drop-in replacement for Martin's smf.sh script, so it has been renamed and posted as a new script.
 +We can debate the merits of this, but I believe in choice. 
 +
 +Martin's script is far simpler to use and configure. 
 +
 +Mine is more complex, but can do more. 
 +
 +Pick your poison.
 +
 +===== Installation =====
 +=== Client side ===
 +1. Copy smf2.ksh to ''$HOME/client/ext''.
 +
 +2. Edit ''client/etc/clientlaunch.cfg'' and insert the following text:
 +
 +  # Service Monitoring
 +  [smf]
 +          ENVFILE $HOBBITCLIENTHOME/etc/hobbitclient.cfg
 +          CMD $HOBBITCLIENTHOME/ext/smf2.ksh
 +          LOGFILE $HOBBITCLIENTHOME/logs/smf.log
 +          INTERVAL 5m
 +
 +=== Server side ===
 +3. Edit the server/etc/client-local.cfg file and insert lines similar to this for each client or section:
 +
 +  [myhost]
 +  SVC:svc:/network/ssh:default online red
 +  SVC:svc:/application/hobbit:default online
 +  SVC:svc:/system/cron:default online yellow
 +  SVC:svc:/network/nfs/server:default disabled red
 +  SVC:svc:/network/cswsyslog_ng:default online yellow
 +  SVC:svc:/network/cswpostfix:default NOTINSTALLED
 +
 +4. See the comments at the top of the script for more options.
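
To make the ''client-local.cfg'' line format concrete, here is a minimal sketch of how such lines split into a service name, an expected state and a fail colour. It mirrors the grep, sed and read steps the script uses, but ''parse_svc_lines'' is a hypothetical helper written for this illustration, and the input is the example configuration from step 3.

```shell
#!/bin/sh
# Sketch only: parse_svc_lines is a hypothetical helper, not part of smf2.ksh.
# It reads configuration lines on stdin and prints the three fields of
# every SVC: line. Missing fields are shown as "unset".
parse_svc_lines() {
    # Keep only lines starting with SVC:, drop that prefix, then let
    # read split the remainder on whitespace into three fields.
    grep '^SVC:' | sed 's/^SVC://' | while read -r svcid expstate fcolour
    do
        echo "service=$svcid expected=${expstate:-unset} colour=${fcolour:-unset}"
    done
}

parse_svc_lines <<'EOF'
[myhost]
SVC:svc:/network/ssh:default online red
SVC:svc:/application/hobbit:default online
SVC:svc:/network/cswpostfix:default NOTINSTALLED
EOF
```

The section header ''[myhost]'' is skipped because it does not start with ''SVC:''. The second and third fields are optional, which is why the script has defaults for a missing state and a missing colour.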
 +
 +===== Source =====
 +==== smf2.ksh ====
 +
 +<hidden onHidden="Show Code ⇲" onVisible="Hide Code ⇱">
 +<code>
 +#!/usr/bin/ksh
 +
 +# A Hobbit script to examine specific Solaris 10 services.
 +
 +# Author: Martin Ward 19 Feb 2008.
 +# Version: 1.0 - Initial version.
 +# V1.1  Script now takes the list of services to monitor from the server
 +#       via the logfetch file.
 +# V2.0  Updated - Vernon Everett 02 Aug 2010
 +#       - Switched to ksh to avoid annoying variable issues in loops
 +#       - Changed how the client-local file worked to allow for single service per line
 +#       - Allowed you to specify if the service should be enabled or disabled
 +# V2.1  Updated - Vernon Everett 03 Aug 2010
 +#       - Added a NOTINSTALLED option to make sure services shouldn't be installed
 +#       - Added logic to cater for services that are supposed to be there, but are not
 +#       - Drove myself to drink getting my head around all those bloody if statements.
 +#       - If you think you can clean it up and make it better, please do.
 +
 +# SVCS is a list of services to examine the status of. Each name must be
 +# specific enough to make it unique in the output from the 'svcs -a' command.
 +# The services themselves are configured on the Hobbit server in the
 +# ~hobbit/server/etc/client-local.cfg file. The lines can look something like:
 +# SVC:/network/ssh:default online red
 +# SVC:/site/tftpd:default offline yellow
 +# SVC:/system/sysidtool:system
 +# SVC:/service/definition/any anystatus green
 +# SVC:/service/definition/any NOTINSTALLED
 +# SVC:/service/definition/any NOTINSTALLED yellow
 +# One service per line
 +# By adding the green colour, it's listed at the top, but doesn't trigger a test fail (Could be useful
 +# for somebody, to ensure the service is highly visible, at the top of the list)
 +
 +# The name of the column in Hobbit
 +COLUMN=smf
 +
 +SVCSCMD=/usr/bin/svcs
 +SVCFILE=$BBTMP/svcs.$$
 +rm $SVCFILE.check >/dev/null 2>&1
 +SVCLIST=$SVCFILE.list
 +
 +# Get a list of things to check for
 +grep "^SVC:" $BBTMP/logfetch.$MACHINEDOTS.cfg | sed "s/^SVC://g" >> $SVCFILE.check
 +$SVCSCMD -aH > $SVCLIST
 +# Make sure it's empty.
 +echo " " > $SVCFILE.out
 +
 +# Set up the initial colour
 +COLOUR=green
 +
 +# Check if we have services to keep tabs on
 +# If not, drop through, and just report a full list of services. Same as svcs -a
 +if [ -s ${SVCFILE}.check ]   # -s: the file exists and is not empty
 +then
 +   while read SVCID EXPSTATE FCOLOUR  # Service, expected state and colour
 +   do
 +      LCOLOUR=green                   # Set the line colour
 +      SVCLINE=$(grep "$SVCID" $SVCLIST) # Now find the service and start checking it
 +
 +      if [ -z "$SVCLINE" ]            # Oops! We are looking for something and can't find it.
 +      then
 +         if [ "$EXPSTATE" != "NOTINSTALLED" ] # OK, it should be there.
 +         then
 +            # This is bad
 +            if [ -n "$FCOLOUR" ]
 +            then
 +               # A colour was defined for this
 +               LCOLOUR="$FCOLOUR"
 +               [ "$LCOLOUR" != "green" -a "$COLOUR" != "red" ] && COLOUR="$LCOLOUR"
 +            else
 +               # No colour defined, assume "not there" is really bad.
 +               LCOLOUR="red"
 +               COLOUR="red"
 +            fi
 +         fi
 +         SVCLINE="NOT FOUND               $SVCID" # Couldn't find it. Set the line to show what we were looking for
 +      else
 +         echo "$SVCLINE" | while read STATE TIME SVCS # We found it. That might be good.
 +         do
 +            if [ "$EXPSTATE" = "NOTINSTALLED" ] # Unless it shouldn't be there
 +            then
 +               if [ -n "$SVCLINE" ]
 +               then
 +                  # It's there and it shouldn't be. That's bad.
 +                  if [ -n "$FCOLOUR" ]
 +                  then
 +                     # A colour was defined for this
 +                     LCOLOUR=$FCOLOUR
 +                     [ "$LCOLOUR" != "green" -a "$COLOUR" != "red" ] && COLOUR=$LCOLOUR
 +                  else
 +                     # No colour defined. Figure one out.
 +                     if [ "$STATE" = "online" -o "$STATE" = "legacy_run" ]
 +                     then
 +                        LCOLOUR="red"      # It's also running. Really bad.
 +                        COLOUR="red"
 +                     else
 +                        LCOLOUR="yellow"     # It's there, but not running. Not so bad, but bad enough
 +                        [ "$COLOUR" != "red" ] && COLOUR="yellow"
 +                     fi
 +                  fi
 +               fi
 +            else
 +               if [ "$FCOLOUR" != "green" ] # We can set the fail colour to green, so we see it at the top, but don't really want
 +                                            # it to trigger an alert. Kinda handy for keeping it easily visible.
 +               then
 +                  if [ -z "$EXPSTATE" ]
 +                  then
 +                  # We never defined an expected state. Assume it should be up or legacy_run
 +                  # and anything else is bad - or at least slightly bad
 +                     case "${STATE}" in
 +                        'uninitialized'|'offline'|'degraded')
 +                           LCOLOUR="yellow"
 +                           if [ "${COLOUR}" != "red" ]
 +                           then
 +                              COLOUR="yellow"
 +                           fi
 +                           ;;
 +                        'maintenance'|'disabled')
 +                           LCOLOUR="red"
 +                           COLOUR="red"
 +                           ;;
 +                        'online'|'legacy_run')
 +                           LCOLOUR="green"
 +                           ;;
 +                     esac
 +                  else
 +                  # We have defined an expected state, and probably the colour if it fails
 +                     if [ "$EXPSTATE" != "$STATE" ]
 +                     then
 +                        if [ -z "$FCOLOUR" ]  # We didn't set a fail colour, so make it red
 +                        then
 +                           LCOLOUR=red
 +                           COLOUR=red
 +                        else
 +                           LCOLOUR=$FCOLOUR    # Otherwise use the defined fail colour
 +                           [ "${COLOUR}" != "red" ] && COLOUR="$FCOLOUR"
 +                        fi
 +                     fi
 +                  fi
 +               fi
 +            fi
 +         done
 +      fi
 +      echo "&$LCOLOUR $SVCLINE" >> $SVCFILE.out
 +   done < ${SVCFILE}.check
 +fi
 +echo >> $SVCFILE.out
 +# Collect a full list of the services.
 +cat $SVCLIST  >> $SVCFILE.out
 +
 +# Tell Hobbit about it
 +$BB $BBDISP "status $MACHINE.$COLUMN $COLOUR `date ; echo ` `cat ${SVCFILE}.out` "
 +# And clean up a little
 +rm -f ${SVCFILE} ${SVCFILE}.out ${SVCFILE}.check ${SVCFILE}.list > /dev/null 2>&1
 +exit 0
 +
 +</code>
 +</hidden>
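
The script looks up each service with a plain ''grep'', which is why the comments ask for names that are unique in the ''svcs -a'' output. As a possible refinement, and not something the script does, the sketch below matches only lines whose FMRI column ends with the configured name. ''find_service'' is a hypothetical helper and the service listing is made-up sample data. On Solaris 10 this would probably need ''nawk'' or ''/usr/xpg4/bin/awk'', since the legacy ''/usr/bin/awk'' does not accept ''-v''.

```shell
#!/bin/sh
# Sketch only: find_service is a hypothetical helper, not part of smf2.ksh.
# Reads "STATE STIME FMRI" lines (the layout of `svcs -aH`) on stdin and
# prints those whose FMRI ends with the requested service name.
find_service() {
    awk -v id="$1" '
        length($3) >= length(id) &&
        substr($3, length($3) - length(id) + 1) == id
    '
}

# Made-up sample listing, for illustration only.
find_service /network/ssh:default <<'EOF'
online         Jul_20   svc:/network/ssh:default
disabled       Jul_20   svc:/network/ssh:default-backup
online         Jul_20   svc:/system/cron:default
EOF
```

A suffix comparison is used instead of a regular expression so that characters in the service name are never treated as pattern metacharacters, and so that both ''svc:/network/ssh:default'' and the shorter ''/network/ssh:default'' form from the script comments keep working.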
 +
 +===== Known Bugs and Issues =====
 +None known, but let me know if you find any.
 +===== To Do =====
 +Fix any bugs reported to me
 +===== Credits =====
 +As stated above, this script was heavily inspired by the efforts of Martin Ward (see smf.sh) and I am grateful to him for showing me the way.
 +
 +I guess a certain level of dubious gratitude should go to my colleague who kept wanting the script to do more, and making my life a pain. :-)
 +
 +===== Changelog =====
 +
 +  * **2010-08-03**
 +    * Initial release
  
  • Last modified: 2010/08/03 05:56