library.pl

Author: Jerald Sheets
Compatibility: Xymon 4.2
Requirements: Perl, Unix
Download: library.pl
Last Update: 2010-01-25

StorNext check for the MDC servers. This check does two things. First, it watches the total number of fsstate processes: with 3 or 4, the check goes yellow; with 5 or more, it goes red. In our environment, that many fsstate processes usually means that the storage manager piece is hung.
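The thresholds above can be sketched as a small Perl function. This is only an illustration: the name fsstate_color is made up for this example and does not appear in library.pl, which does the comparison inline.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of library.pl): map the number of
# running fsstate processes to a Xymon status color.
sub fsstate_color {
    my ($count) = @_;
    return 'red'    if $count >= 5;   # 5 or more: storage manager may be hung
    return 'yellow' if $count >= 3;   # 3 or 4: instances are piling up
    return 'green';                   # 0 to 2: nothing unusual
}

# Show the color chosen for a few sample process counts
print "$_ fsstate processes -> ", fsstate_color($_), "\n" for (1, 3, 4, 5, 8);
```

Checking the higher threshold first keeps each condition to a single comparison.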

Next, it counts the tape drives that are IN USE, FREE, or DELAYED and passes those values in NCV format to the graphing engine.
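As a rough sketch of that counting step, the following Perl tallies lines by state in a single pass and prints them as NCV (name:value) lines. The sample lines are invented for illustration, since real fsstate output looks different, and the helper names count_states and ncv_lines are assumptions that do not appear in library.pl. The names Active, Free, and Delayed match the NCV_library setting used on the server side.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample lines; real fsstate output will look different.
my @sample = (
    'drive1  IN USE',
    'drive2  FREE',
    'drive3  FREE',
    'drive4  DELAYED',
);

# Count how many lines mention each drive state (same idea as grep | wc -l).
sub count_states {
    my @lines = @_;
    my %count = (Active => 0, Free => 0, Delayed => 0);
    for my $line (@lines) {
        $count{Active}++  if $line =~ /IN USE/;
        $count{Free}++    if $line =~ /FREE/;
        $count{Delayed}++ if $line =~ /DELAYED/;
    }
    return %count;
}

# Render the counts as one "Name:value" line per data set.
sub ncv_lines {
    my %count = @_;
    return join '', map { "$_:$count{$_}\n" } qw(Active Free Delayed);
}

print ncv_lines(count_states(@sample));
```

On a real MDC, the list passed to count_states could come from running /usr/adic/TSM/bin/fsstate once in backticks, rather than once per state.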

Client side

Take the library.pl script and place it into $HOBBITCLIENTHOME/ext. Then, at the end of $HOBBITCLIENTHOME/etc/clientlaunch.cfg, add the following:

                                                                 
   # MDC Fsstate Data Collector.  This is a test module to report
   # the current state of the libraries and the fsstate command.
   [fsstate]                                                   
        ENVFILE $HOBBITCLIENTHOME/etc/hobbitclient.cfg        
        CMD $HOBBITCLIENTHOME/ext/library.pl                  
        LOGFILE $HOBBITCLIENTHOME/logs/library.log            
        INTERVAL 3m                                           

Server side

In your $HOBBITSERVERHOME/etc/hobbitserver.cfg file, add the following at the end of your “TEST2RRD” line:

   library=ncv                                                            

Next, add the following lines immediately below the “TEST2RRD” line.

   # This defines the custom graphs specified in the above TEST2RRD section
       NCV_library="Active:GAUGE,Free:GAUGE,Delayed:GAUGE"                     

In the “GRAPHS” line in the same file, place “library” at the end of the other entries.

Finally, define the graphs for these values in the Xymon graphs configuration $HOBBITSERVERHOME/etc/hobbitgraph.cfg like so:

   # MDC Controller Drive Status Graphs
   [library]
        TITLE Library Drive Utilization                                     
        YAXIS Number of Drives                                              
        DEF:active=library.rrd:Active:AVERAGE                               
        DEF:free=library.rrd:Free:AVERAGE                                   
        DEF:delayed=library.rrd:Delayed:AVERAGE                             
        LINE2:active#00CCCC:Active Drives                                   
        LINE2:free#09801D:Free Drives                        
        LINE2:delayed#FF0000:Delayed Drives                  
        COMMENT:\n                                           
        GPRINT:active:LAST:Active Drives \: %5.1lf%s (cur)   
        GPRINT:active:MAX: \: %5.1lf%s (max)                 
        GPRINT:active:MIN: \: %5.1lf%s (min)                               
        GPRINT:active:AVERAGE: \: %5.1lf%s (avg)\n                         
        GPRINT:free:LAST:Free Drives \: %5.1lf%s (cur)                     
        GPRINT:free:MAX: \: %5.1lf%s (max)                                 
        GPRINT:free:MIN: \: %5.1lf%s (min)                                 
        GPRINT:free:AVERAGE: \: %5.1lf%s (avg)\n                           
        GPRINT:delayed:LAST:Delayed Drives \: %5.1lf%s (cur)               
        GPRINT:delayed:MAX: \: %5.1lf%s (max)                              
        GPRINT:delayed:MIN: \: %5.1lf%s (min)                              
        GPRINT:delayed:AVERAGE: \: %5.1lf%s (avg)\n                        

If all goes well, you will get a library column on your MDC servers. It will display the Active, Free, and Delayed tape drives and will graph each one.
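If the column does not show up, it can help to look at the text the client sends. The sketch below builds a status message in the same layout library.pl passes to the bb command ("status host.test color date head", then the data, then the message). The helper name status_message and the sample values are assumptions for illustration only.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of library.pl): build the text of a
# Xymon "status" message from its parts.
sub status_message {
    my (%arg) = @_;
    (my $host = $arg{host}) =~ s/\./,/g;   # dots in the hostname become commas
    return "status $host.$arg{test} $arg{color} $arg{date} $arg{head}\n"
         . "$arg{data}\n$arg{msg}";
}

# Sample values only; a real run takes these from the environment and fsstate
print status_message(
    host  => 'test.host.cvf',
    test  => 'library',
    color => 'green',
    date  => scalar(localtime),
    head  => 'Fsstate OK',
    data  => "Active:2\nFree:5\nDelayed:0",
    msg   => "&green MDC Fsstate Normal\n",
);
```

Setting $DEBUG to 1 in library.pl gives a similar preview, because it swaps the bb command for /bin/echo.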

library.pl

#!/usr/bin/perl -w
#
#
##############################################################################################
#                                                                                            #
# Program:              library.pl                                                           #
# Author:               Jerald Sheets                                                        #
# Initial Version:      0.98                                                                 #
# Changelog:            01/04/10        Initial Release                                      #
#                                                                                            #
############################################################################################## 
#                                                                                            #
# Purpose:  This perl script is meant to run on a pair of StorNext MDC servers.  It          #
#           does a couple things.  First, it checks how many fsstate processes are           #
#           running, and sets the status to yellow for 3 or 4 and red for 5 or more.         #
#           Generally speaking, if you have this many fsstate processes running, it          #
#           could be that StorNext is hung, and you need to have a look.  Next, it           #
#           runs fsstate and looks for how many drives are in one of three specific          #
#           states (IN USE, FREE, or DELAYED) and counts those up.  Then, it outputs         #
#           the amount in NCV format for use in Xymon.  Finally, by setting up the           #
#           Xymon server to do so, it will graph the number of tapes in a particular         #
#           state and provide averages over time.                                            #
#                                                                                            #
##############################################################################################
#                                                                                            #
# Deployment:  Take the library.pl script, and place it into $HOBBITCLIENTHOME/ext           #
#              In the $HOBBITCLIENTHOME/etc/clientlaunch.cfg place the following at          #
#              the end of the file:                                                          #
#                                                                                            #
#                  # MDC Fsstate Data Collector.  This is a test module to report            #
#                  # the current state of the libraries and the fsstate command.             #
#                  [fsstate]                                                                 #
#                       ENVFILE $HOBBITCLIENTHOME/etc/hobbitclient.cfg                       #
#                       CMD $HOBBITCLIENTHOME/ext/library.pl                                 #
#                       LOGFILE $HOBBITCLIENTHOME/logs/library.log                           #
#                       INTERVAL 3m                                                          #
#                                                                                            #
#              Once you have done this, do the following on your Xymon Server:               #
#                                                                                            #
#              At the end of your $HOBBITSERVERHOME/etc/hobbitserver.cfg file,               #
#              add the following at the end of your "TEST2RRD" line:                         #
#                                                                                            #
#                   library=ncv                                                              #
#                                                                                            #
#              Next, add the following lines immediately below the "TEST2RRD"                #
#              line.                                                                         #
#                                                                                            #
#                   # This defines the custom graphs specified in the above TEST2RRD section #
#                   NCV_library="Active:GAUGE,Free:GAUGE,Delayed:GAUGE"                      #
#                                                                                            #
#              In the "GRAPHS" line in the same file, place "library" at the end             #
#              of the other entries.                                                         #
#                                                                                            #
#              Finally, define the graphs for these values in the Xymon graphs               #
#              configuration $HOBBITSERVERHOME/etc/hobbitgraph.cfg  like so:                 #
#                                                                                            #
#                   # MDC Controller Drive Status Graphs                                     #
#                   [library]                                                                #
#                       TITLE Library Drive Utilization                                      #
#                       YAXIS Number of Drives                                               #
#                       DEF:active=library.rrd:Active:AVERAGE                                #
#                       DEF:free=library.rrd:Free:AVERAGE                                    #
#                       DEF:delayed=library.rrd:Delayed:AVERAGE                              #
#                       LINE2:active#00CCCC:Active Drives                                    #
#                       LINE2:free#09801D:Free Drives                                        #
#                       LINE2:delayed#FF0000:Delayed Drives                                  #
#                       COMMENT:\n                                                           #
#                       GPRINT:active:LAST:Active Drives \: %5.1lf%s (cur)                   #
#                       GPRINT:active:MAX: \: %5.1lf%s (max)                                 #
#                       GPRINT:active:MIN: \: %5.1lf%s (min)                                 #
#                       GPRINT:active:AVERAGE: \: %5.1lf%s (avg)\n                           #
#                       GPRINT:free:LAST:Free Drives \: %5.1lf%s (cur)                       #
#                       GPRINT:free:MAX: \: %5.1lf%s (max)                                   #
#                       GPRINT:free:MIN: \: %5.1lf%s (min)                                   #
#                       GPRINT:free:AVERAGE: \: %5.1lf%s (avg)\n                             #
#                       GPRINT:delayed:LAST:Delayed Drives \: %5.1lf%s (cur)                 #
#                       GPRINT:delayed:MAX: \: %5.1lf%s (max)                                #
#                       GPRINT:delayed:MIN: \: %5.1lf%s (min)                                #
#                       GPRINT:delayed:AVERAGE: \: %5.1lf%s (avg)\n                          #
#                                                                                            #
#              If all goes well, you will get a library column on your MDC servers.  It      #
#              will display the Active, Free, and Delayed tape drives and will graph each    #
#              one.                                                                          #
#                                                                                            #
##############################################################################################

use strict;

# My variables and commands
my($DEBUG)      = "0";
my($inuse)      = `/usr/adic/TSM/bin/fsstate |grep "IN USE"|wc -l`;
my($free)       = `/usr/adic/TSM/bin/fsstate |grep "FREE"|wc -l`;
my($delayed)    = `/usr/adic/TSM/bin/fsstate |grep "DELAYED"|wc -l`;
my($masterFile) = '/usr/adic/install/.active_snfs_server';
my($isMaster)   = "";

# Xymon Variables
$ENV{BBPROG}    = "library.pl";
my($TESTNAME)   = "library";
my($BBHOME)     = $ENV{BBHOME};
my($BB)         = $ENV{BB};
my($BBDISP)     = $ENV{BBDISP};
my($BBVAR)      = $ENV{BBVAR};
my($MACHINE)    = $ENV{MACHINE};
my($DATE)       = localtime;
my($COLOR)      = "clear";
my($MSG)        = "";
my($HEAD)       = "";
my($DATA)       = "";

# Invoke debug routine if flag is set above
if ($DEBUG == 1){
   # ||= supplies a default only when the variable is unset or empty
   $BBHOME      ||= "/tmp";
   $BB           = "/bin/echo";
   $BBDISP      ||= "127.0.0.1";
   $BBVAR       ||= "/tmp";
   $MACHINE     ||= "test.host.cvf";
}

# Fsstate Processes - sanity check
my($FSCMD)      = `/bin/ps -ef |grep fsstate |grep -v grep |wc -l`;

if($FSCMD >= 5){
   $HEAD = "MDC Fsstate Critical.\n";
   $MSG = "5 or more fsstate processes detected.  MDC possibly hung.\n";
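      sendRed();
      sendReport();
      # Stop here so the green/clear report sent further down does not overwrite this alert
      exit 0;
}elsif(($FSCMD == 3) || ($FSCMD == 4)){
   $HEAD = "MDC Fsstate Cautionary.\n";
   $MSG = "Fsstate instances rising.  Now 3 or 4 are active.\n";
      sendYellow();
      sendReport();
      # Stop here for the same reason
      exit 0;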
}


# Figure out if you're the master server or not
isMaster();

# If you are the master server, set the variable.  If not, update Xymon that "all is clear"
if($isMaster eq "yes"){
   getStats();
}else{
   # If you're here, you're not the master.  Set "clear" as the variable, and then send it on to the server, exiting normally.
   sendClear();
   sendReport();
   exit 0;
}


#################
## Subroutines ##
#################

# Determines if system is the master MDC Controller
sub isMaster {
   if(-e $masterFile){
      $isMaster = "yes";
   }else{
      $isMaster = "no";
   }
}

# Parses the output of "fsstate" and presents it to Xymon as appropriate values in NCV format
sub getStats {
   my($ACTIVE)  = "$inuse";
   my($OPEN)    = "$free";
   my($WAIT)    = "$delayed";
      head("Fsstate OK");
      msg("&green MDC Fsstate Normal");
      $DATA = "
      Active:$ACTIVE
      Free:$OPEN
      Delayed:$WAIT
              ";
   sendGreen();
   sendReport();
}

# In the event this is not the master server, this empties the variables and sends a "clear" to 
# the server for this host, preventing purple (no data) being sent to the server for this host. 
# All other statuses here simply set the color when called.
sub sendClear {
   $MSG = $DATA = $HEAD = '';
   $COLOR = 'clear';
}

sub sendRed {
   $COLOR = 'red';
}

sub sendYellow {
   $COLOR = 'yellow';
}

sub sendGreen {
   $COLOR = 'green';
}

# This runs the local bb instance and sends the report with all the values necessary to set 
# the Xymon server in the appropriate status
sub sendReport {
   $MACHINE =~ s/\./,/g;
      my($cmd) = "$BB $BBDISP \"status $MACHINE.$TESTNAME $COLOR $DATE $HEAD\n$DATA\n$MSG\"";
      system($cmd);
}

# Format the header 
sub head
{
    $HEAD = "@_";
}


# Clean up messaging a bit
sub msg
{
    $MSG .= join("\n", @_) . "\n";
}

This has only been tested on StorNext MDC servers running on Linux. If you have any issues you need resolved, give me a shout, and I'll see if I can help.

To do: make the code prettier here in-house, and split our libraries into multiple graphs.

There are a couple of subroutine snippets here and there that I yanked from this very site. Since I primarily wrote this “on the fly” to solve a problem, in the heat of the moment I didn't think to write down who wrote them or where they came from. If you see anything in here you recognize, feel free to let me know, and I'll give you proper props for your work.

  • 2010-01-25
    • Initial release
  • Last modified: 2010/01/25 21:12