monitors:smf2.ksh [2010/08/03 05:56] (current)
 +====== smf2.ksh ======
 +
 +^ Author | [[ everett.vernon@gmail.com | Vernon Everett ]] |
 +^ Compatibility | Xymon 4.2 |
 +^ Requirements | Solaris 10 |
 +^ Download | None |
 +^ Last Update | 2010-08-03 |
 +
 +===== Description =====
 +A somewhat more complex smf monitoring script, heavily inspired by Martin Ward's smf.sh on this site.
 +Some of the changes:
 +  * switched to ksh to work around some loop "features" of sh which I don't like
 +  * changed the config to a "one service per line" model with defined colours
 +  * allows you to monitor a service in any state, since some services should be disabled
 +  * added logic for when a service isn't there at all
 +  * added a "NOTINSTALLED" option to check that a service is not installed
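
The state and colour rules listed above can be summarised as one small decision function. This is a minimal sketch, not an excerpt from smf2.ksh: the function name ''line_colour'' and its argument order are hypothetical, and an empty first argument is assumed to mean that the service was not found in the ''svcs -aH'' output.

```shell
#!/bin/sh
# Hypothetical helper: print the colour for one configured service line.
#   $1 = actual state reported by svcs ("" if the service was not found)
#   $2 = expected state from the config ("" if none, or NOTINSTALLED)
#   $3 = fail colour from the config ("" if none)
line_colour() {
    state=${1-} expstate=${2-} fcolour=${3-}

    if [ -z "$state" ]; then
        # Service is missing: fine only when NOTINSTALLED was requested.
        if [ "$expstate" = "NOTINSTALLED" ]; then
            echo green
        else
            echo "${fcolour:-red}"
        fi
    elif [ "$expstate" = "NOTINSTALLED" ]; then
        # Present but unwanted: use the fail colour, else judge by state.
        if [ -n "$fcolour" ]; then
            echo "$fcolour"
        else
            case $state in
                online|legacy_run) echo red ;;
                *)                 echo yellow ;;
            esac
        fi
    elif [ "$fcolour" = "green" ]; then
        # A green fail colour lists the service without ever alerting.
        echo green
    elif [ -z "$expstate" ]; then
        # No expected state given: assume the service should be running.
        case $state in
            uninitialized|offline|degraded) echo yellow ;;
            maintenance|disabled)           echo red ;;
            *)                              echo green ;;
        esac
    elif [ "$expstate" != "$state" ]; then
        # Wrong state: use the fail colour, defaulting to red.
        echo "${fcolour:-red}"
    else
        echo green
    fi
}
```

For example, ''line_colour disabled online yellow'' prints ''yellow'', while ''line_colour "" NOTINSTALLED'' prints ''green'' because the absent service is exactly what was asked for.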
 +
 +This script has evolved to the point where it is no longer a drop-in replacement for Martin's smf.sh script, so it has been renamed and posted as a new script.
 +We can debate the merits of this, but I believe in choice.
 +
 +Martin's script is far simpler to use and configure.
 +
 +Mine is more complex, but can do more.
 +
 +Pick your poison.
 +
 +===== Installation =====
 +=== Client side ===
 +1. Copy smf2.ksh to ''$HOME/client/ext''
 +
 +2. Edit ''client/etc/clientlaunch.cfg'' and insert the following text:
 +
 +  # Service Monitoring
 +  [smf]
 +          ENVFILE $HOBBITCLIENTHOME/etc/hobbitclient.cfg
 +          CMD $HOBBITCLIENTHOME/ext/smf2.ksh
 +          LOGFILE $HOBBITCLIENTHOME/logs/smf.log
 +          INTERVAL 5m
 +
 +=== Server side ===
 +3. Edit the ''server/etc/client-local.cfg'' file and insert lines similar to these for each client or section:
 +
 +  [myhost]
 +  SVC:svc:/network/ssh:default online red
 +  SVC:svc:/application/hobbit:default online
 +  SVC:svc:/system/cron:default online yellow
 +  SVC:svc:/network/nfs/server:default disabled red
 +  SVC:svc:/network/cswsyslog_ng:default online yellow
 +  SVC:svc:/network/cswpostfix:default NOTINSTALLED
 +
 +4. See script comments for more options
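
How the client turns these lines into checks can be sketched as follows. The snippet reads config text from standard input rather than from the logfetch file, and the function name ''parse_svc_lines'' is made up for illustration; only the prefix stripping and the three-field split mirror what the script does.

```shell
#!/bin/sh
# Hypothetical helper: read client-local.cfg text on stdin, keep the SVC:
# lines, strip the prefix and split each line into its three fields.
parse_svc_lines() {
    grep '^SVC:' | sed 's/^SVC://' |
    while read -r svcid expstate fcolour; do
        # Missing fields are simply empty; show them as "-" here.
        echo "service=$svcid expected=${expstate:--} colour=${fcolour:--}"
    done
}

parse_svc_lines <<'EOF'
[myhost]
SVC:svc:/network/ssh:default online red
SVC:svc:/application/hobbit:default online
SVC:svc:/network/cswpostfix:default NOTINSTALLED
EOF
```

Each SVC line yields a service identifier, an optional expected state and an optional fail colour. The section header and any non-SVC lines are ignored.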
 +
 +===== Source =====
 +==== smf2.ksh ====
 +
 +<hidden onHidden="Show Code ⇲" onVisible="Hide Code ⇱">
 +<code>
 +#!/usr/bin/ksh
 +
 +# A Hobbit script to examine specific Solaris 10 services.
 +
 +# Author: Martin Ward 19 Feb 2008.
 +# Version: 1.0 - Initial version.
 +# V1.1  Script now takes the list of services to monitor from the server
 +#       via the logfetch file.
 +# V2.0  Updated - Vernon Everett 02 Aug 2010
 +#       - Switched to ksh to avoid annoying variable issues in loops
 +#       - Changed how the client-local file worked to allow for single service per line
 +#       - Allowed you to specify if the service should be enabled or disabled
 +# V2.1  Updated - Vernon Everett 03 Aug 2010
 +#       - Added a NOTINSTALLED option to check that a service is not installed
 +#       - Added logic to cater for services that are supposed to be there, but are not
 +#       - Drove myself to drink getting my head around all those bloody if statements.
 +#       - If you think you can clean it up and make it better, please do.
 +
 +# SVCS is a list of services to examine the status of. Each name must be
 +# specific enough to make it unique in the output from the 'svcs -a' command.
 +# The services themselves are configured on the Hobbit server in the
 +# ~hobbit/server/etc/client-local.cfg file. The lines can look something like:
 +# SVC:/network/ssh:default online red
 +# SVC:/site/tftpd:default offline yellow
 +# SVC:/system/sysidtool:system
 +# SVC:/service/definition/any anystatus green
 +# SVC:/service/definition/any NOTINSTALLED
 +# SVC:/service/definition/any NOTINSTALLED yellow
 +# One service per line
 +# By adding the green colour, it's listed at the top, but doesn't trigger a test fail (could be useful
 +# for somebody, to ensure the service is highly visible, at the top of the list)
 +
 +# The name of the column in Hobbit
 +COLUMN=smf
 +
 +SVCSCMD=/usr/bin/svcs
 +SVCFILE=$BBTMP/svcs.$$             # Per-run prefix for the temporary files
 +rm -f $SVCFILE.check >/dev/null 2>&1
 +SVCLIST=$SVCFILE.list
 +
 +# Get a list of things to check for
 +grep "^SVC:" $BBTMP/logfetch.$MACHINEDOTS.cfg | sed "s/^SVC://g" >> $SVCFILE.check
 +$SVCSCMD -aH > $SVCLIST
 +# Start the output file afresh with a single blank line.
 +echo " " > $SVCFILE.out
 +
 +# Set up the initial colour
 +COLOUR=green
 +
 +# Check if we have services to keep tabs on
 +# If not, drop through, and just report a full list of services. Same as svcs -a
 +if [ -s ${SVCFILE}.check ]           # The check list exists and is not empty
 +then
 +   while read SVCID EXPSTATE FCOLOUR  # Service, expected state and colour
 +   do
 +      LCOLOUR=green                   # Set the line colour
 +      SVCLINE=$(grep "$SVCID" $SVCLIST) # Now find the service and start checking it
 +
 +      if [ -z "$SVCLINE" ]            # Oops! We are looking for something and can't find it.
 +      then
 +         if [ "$EXPSTATE" != "NOTINSTALLED" ] # OK, it should be there.
 +         then
 +            # This is bad
 +            if [ -n "$FCOLOUR" ]
 +            then
 +               # A colour was defined for this
 +               LCOLOUR="$FCOLOUR"
 +               [ "$LCOLOUR" != "green" -a "$COLOUR" != "red" ] && COLOUR="$LCOLOUR"
 +            else
 +               # No colour defined, assume "not there" is really bad.
 +               LCOLOUR="red"
 +               COLOUR="red"
 +            fi
 +         fi
 +         SVCLINE="NOT FOUND               $SVCID" # Couldn't find it. Set the line to show what we were looking for
 +      else
 +         echo "$SVCLINE" | while read STATE TIME SVCS # We found it. That might be good.
 +         do
 +            if [ "$EXPSTATE" = "NOTINSTALLED" ] # Unless it shouldn't be there
 +            then
 +               if [ -n "$SVCLINE" ]
 +               then
 +                  # It's there and it shouldn't be. That's bad.
 +                  if [ -n "$FCOLOUR" ]
 +                  then
 +                     # A colour was defined for this
 +                     LCOLOUR=$FCOLOUR
 +                     [ "$LCOLOUR" != "green" -a "$COLOUR" != "red" ] && COLOUR=$LCOLOUR
 +                  else
 +                     # No colour defined. Figure one out.
 +                     if [ "$STATE" = "online" -o "$STATE" = "legacy_run" ]
 +                     then
 +                        LCOLOUR="red"        # It's also running. Really bad.
 +                        COLOUR="red"
 +                     else
 +                        LCOLOUR="yellow"     # It's there, but not running. Not so bad, but bad enough
 +                        [ "$COLOUR" != "red" ] && COLOUR="yellow"
 +                     fi
 +                  fi
 +               fi
 +            else
 +               if [ "$FCOLOUR" != "green" ] # We can set the fail colour to green, so we see it at the top, but don't really want
 +                                            # it to trigger an alert. Kinda handy for keeping it easily visible.
 +               then
 +                  if [ -z "$EXPSTATE" ]
 +                  then
 +                     # We never defined an expected state. Assume it should be up or legacy_run
 +                     # and anything else is bad - or at least slightly bad
 +                     case "${STATE}" in
 +                        'uninitialized'|'offline'|'degraded')
 +                           LCOLOUR="yellow"
 +                           if [ "${COLOUR}" != "red" ]
 +                           then
 +                              COLOUR="yellow"
 +                           fi
 +                           ;;
 +                        'maintenance'|'disabled')
 +                           LCOLOUR="red"
 +                           COLOUR="red"
 +                           ;;
 +                        'online'|'legacy_run')
 +                           LCOLOUR="green"
 +                           ;;
 +                     esac
 +                  else
 +                     # We have defined an expected state, and probably the colour if it fails
 +                     if [ "$EXPSTATE" != "$STATE" ]
 +                     then
 +                        if [ -z "$FCOLOUR" ]  # We didn't set a fail colour, so make it red
 +                        then
 +                           LCOLOUR=red
 +                           COLOUR=red
 +                        else
 +                           LCOLOUR=$FCOLOUR    # Otherwise use the defined fail colour
 +                           [ "${COLOUR}" != "red" ] && COLOUR="$FCOLOUR"
 +                        fi
 +                     fi
 +                  fi
 +               fi
 +            fi
 +         done
 +      fi
 +      echo "&$LCOLOUR $SVCLINE" >> $SVCFILE.out
 +   done < ${SVCFILE}.check
 +fi
 +echo >> $SVCFILE.out
 +# Collect a full list of the services.
 +cat $SVCLIST >> $SVCFILE.out
 +
 +# Tell Hobbit about it
 +$BB $BBDISP "status $MACHINE.$COLUMN $COLOUR $(date) $(cat ${SVCFILE}.out) "
 +# And clean up a little
 +rm -f ${SVCFILE} ${SVCFILE}.out ${SVCFILE}.check ${SVCFILE}.list > /dev/null 2>&1
 +exit 0
 +
 +</code>
 +</hidden>
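
The script above keeps a single page colour that only ever gets worse as each line is evaluated. A compact way to express that rule is sketched below, assuming only green, yellow and red are used as colours; ''worse_colour'' is a hypothetical helper and not part of smf2.ksh.

```shell
#!/bin/sh
# Hypothetical helper: print the more severe of two colours,
# assuming only green, yellow and red are in use.
worse_colour() {
    case "$1:$2" in
        red:*|*:red)       echo red ;;
        yellow:*|*:yellow) echo yellow ;;
        *)                 echo green ;;
    esac
}

# Fold a list of per-line colours into one page colour.
page=green
for line in green yellow green red yellow; do
    page=$(worse_colour "$page" "$line")
done
echo "$page"
```

Once any line is red, the page stays red regardless of what follows, which is the same effect the ''COLOUR'' checks in the script aim for.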
 +
 +===== Known Bugs and Issues =====
 +None known, but let me know if you find any.
 +===== To Do =====
 +Fix any bugs reported to me
 +===== Credits =====
 +As stated above, this script was heavily inspired by the efforts of Martin Ward (see smf.sh) and I am grateful to him for showing me the way.
 +
 +I guess a certain level of dubious gratitude should go to my colleague who kept wanting the script to do more, and making my life a pain. :-)
 +
 +===== Changelog =====
 +
 +  * **2010-08-03**
 +    * Initial release
  