
RetDS

Author Wim Nelis
Compatibility Xymon 4.2
Requirements Perl
Download None
Last Update 2015-03-17

Script retds.pl retrieves the DNS statistics of both BIND named servers (both the old, pre-9.6 style and the new style) and Windows DNS servers. A table within the script specifies which statistics are extracted and, at the same time, defines the RRD file name and the DS name of each statistic. The DNS servers to be monitored are defined in the Xymon hosts.cfg file, using keyword RNAMED.

Script retds.pl is a rewrite (in Perl) of script xymon-rnamedstats.sh written by Jeremy Laidman. It is extended to handle Windows DNS servers as well.

Script retds.pl is a server-side script. In keyword RNAMED the parameter 'cmd' specifies the command used to retrieve the statistics from the DNS server. Its usage differs between BIND and Windows DNS. In case of BIND, the command should open a session to a shell running on the server involved; the command “cat <statsfile>” is then sent to this shell to retrieve the statistics. In case of Windows DNS, the command itself should write the statistics to standard output.
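
The difference between the two cases can be illustrated with a minimal sketch of how such a command line might be assembled. Function build_command, the host names and the IP addresses are made up for this illustration; the sketch omits the special case in which 'cmd' contains no space and the host name is simply appended.

```perl
#!/usr/bin/perl
use strict ;
use warnings ;

# Hypothetical helper: expand the placeholders in a 'cmd' value and, for a
# BIND server, prepend the commands which are fed to the remote shell.
sub build_command {
  my ( $cmd, $host, $ip, $source, $statsfile )= @_ ;
  my $short= ( split /\./, $host )[0] ;   # Host name without domain name

  $cmd=~ s/%\{H\}/$host/g ;               # Full host name
  $cmd=~ s/%\{h\}/$short/g ;              # Short host name
  $cmd=~ s/%\{I\}/$ip/g ;                 # IP address

  if ( $source eq 'bind' ) {              # Shell session: send "cat" and "exit"
    $cmd= "{ echo \"cat $statsfile\" ; echo \"exit\" ; } | $cmd" ;
  }
  return $cmd ;                           # Windows DNS: command used as is
}

# Example values only; nothing is executed here.
print build_command( 'ssh -T AUser@%{H} 2>/dev/null', 'ns1.example.com',
                     '192.0.2.53', 'bind', 'named.stats' ), "\n" ;
print build_command( 'wget http://%{H}/dnsstat.txt -o /dev/null -O -',
                     'dns2.example.com', '192.0.2.54', 'dnscmd', '' ), "\n" ;
```

For BIND the resulting pipeline writes two commands to the standard input of the remote shell, whereas for Windows DNS the command line is only subject to placeholder substitution.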

Although script retds.pl is a server-side script, there is some work to do at the client-side as well.

Client side on a BIND server

For a BIND server, periodically the BIND statistics need to be extracted and saved in a file, which can be retrieved by script retds.pl. Typically there will be an entry in crontab for user root which drops the statistics in a file, which is readable by a non-root user.

For example, the statistics could be collected using the following crontab entry of user root:

  0-55/5 * * * * /usr/bin/rndc stats ; mv /var/named/data/named_stats.txt ~AUser/named.stats

Client side on a Windows DNS server

For a Windows DNS server, the command specified in keyword RNAMED should deliver the DNS statistics to standard output. A Windows DNS server does not include the time of retrieval in the statistics, but this information is needed to confirm that the process collecting the statistics is still refreshing them. Therefore, the date and the time need to be added. The first line of the result should contain the date on which the statistics were collected, and the second line the time of day. The following lines contain the output of the `dnscmd /statistics` command.
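
How the two header lines can be turned into a time stamp and checked for freshness is sketched below. The helper names header_to_epoch and is_fresh are hypothetical. The sketch assumes a date formatted as dd-mm-yyyy (optionally preceded by a day name), a time formatted as hh:mm, and clocks on both servers set to the same time zone; the limits of 600 seconds of age and 10 seconds of clock slack follow the values used in retds.pl.

```perl
#!/usr/bin/perl
use strict ;
use warnings ;
use Time::Local ;

# Hypothetical helper: convert the date line and the time line into an epoch
# time. Returns undef (an empty list in list context) if a line does not parse.
sub header_to_epoch {
  my ( $date, $time )= @_ ;
  my ( $hour, $min )= $time=~ m/^\s*(\d+):(\d+)/        or return ;
  my ( $day, $mon, $year )=
       $date=~ m/^(?:[a-z]+\s+)?(\d+)-(\d+)-(\d+)/i     or return ;
  return timelocal( 0, $min, $hour, $day, $mon - 1, $year ) ;
}

# Hypothetical helper: accept a sample which is at most $max_age seconds old
# and at most 10 seconds ahead of the local clock.
sub is_fresh {
  my ( $epoch, $now, $max_age )= @_ ;
  return 0              unless defined $epoch ;
  my $age= $now - $epoch ;
  return ( $age >= -10 and $age <= $max_age ) ? 1 : 0 ;
}

# Example values only.
my $stamp= header_to_epoch( "17-03-2015", "14:35" ) ;
print scalar( localtime $stamp ), "\n" ;
print is_fresh( $stamp, $stamp + 300, 600 ) ? "fresh\n" : "stale\n" ;
```

Note that the output format of `date /t` depends on the regional settings of the Windows server, so the pattern for the date line may need to be adapted.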

One possible solution is to use a special webpage to publish the DNS statistics. The statistics can be collected using the following script, which is invoked every 5 minutes:

  @echo off
  date /t > c:\temp\dnsstat.txt
  time /t >> c:\temp\dnsstat.txt
  dnscmd %COMPUTERNAME% /statistics 0x40010x >> c:\temp\dnsstat.txt
  copy c:\temp\dnsstat.txt c:\inetpub\wwwroot\dnsstat.txt

Server side

Script retds.pl will typically be placed in directory $XYMONHOME/ext. It will be run periodically by adding the following snippet to $XYMONHOME/etc/tasks.cfg:

  [retds]
      ENVFILE $XYMONHOME/etc/xymonserver.cfg
      CMD $XYMONHOME/ext/retds.pl
      LOGFILE $XYMONSERVERLOGS/retds.log
      INTERVAL 5m

In file $XYMONHOME/etc/hosts.cfg one needs to identify the hosts for which this script should extract the DNS usage statistics. For a BIND server, two tags need to be added or modified, RNAMED and TRENDS. Matching the example above, the following values could be used:

  RNAMED:"cmd(ssh -T AUser@%{H} 2>/dev/null),statsfile(named.stats)"
  TRENDS:*,bindstats

The same tags need to be specified for a Windows DNS server. Again, matching the example above the following values could be used:

  RNAMED:"cmd(wget http://%{H}/dnsstat.txt -o /dev/null -O -),source(dnscmd)"
  TRENDS:*,wdnsstats
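
The structure of the RNAMED tag can be made explicit with a small parser. The sketch below is a simplified illustration, not the code of retds.pl: function parse_rnamed is hypothetical, and only two of the default values are shown. It accepts both quoting styles, that is RNAMED:"..." and "RNAMED:...".

```perl
#!/usr/bin/perl
use strict ;
use warnings ;

# Hypothetical helper: extract the RNAMED parameters from the tag part of a
# hosts.cfg line. Returns a hash reference, or undef on a syntax error.
sub parse_rnamed {
  my $tags= shift ;
  my ( $pars )= $tags=~ m/RNAMED:"(.+?)"/ ;               # RNAMED:"..."
     ( $pars )= $tags=~ m/"RNAMED:(.+?)"/ unless defined $pars ;  # "RNAMED:..."
  return undef          unless defined $pars ;

  my %par= ( source => 'bind', testname => 'trends' ) ;   # Some defaults
  while ( $pars=~ s/^([a-z]+)\((.+?)\)(?:\s*,\s*)?// ) {
    $par{$1}= $2 ;                        # name(value) pair
  }
  return $pars eq '' ? \%par : undef ;    # Leftover text: syntax error
}

# Example tag string, matching the Windows DNS example above.
my $tags= 'RNAMED:"cmd(wget http://%{H}/dnsstat.txt -o /dev/null -O -),source(dnscmd)" TRENDS:*,wdnsstats' ;
my $par = parse_rnamed( $tags ) ;
print "$_ = $$par{$_}\n"        foreach sort keys %$par ;
```

Because a value ends at the first closing parenthesis, a command containing parentheses of its own cannot be expressed in this format.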

In file $XYMONHOME/etc/xymonserver.cfg the text “,bindstats” needs to be added to the definition of the variables TEST2RRD and GRAPHS if there is at least one BIND server being monitored. The text “,wdnsstats” needs to be added to the same variables in case at least one Windows DNS server is monitored.

#!/usr/bin/perl -w
#
# This script collects statistics about the use of DNS servers. It can handle
# statistics of both BIND (named) and Windows DNS servers. The script gets the
# statistics over a remote connection, either one which delivers the output on
# standard output, or one which provides a shell prompt. It feeds the results
# into Xymon.
#
# This script is inspired by script xymon-rnamedstats.sh written by Jeremy
# Laidman. This script uses (almost) the same format for the parameters in the
# xymon hosts configuration file. It also generates similar names for the RRD
# files and the same DS names.
#
# Hosts to be queried for DNS-service statistics are flagged with the RNAMED
# key in the Xymon hosts.cfg file. This key allows for parameters to be
# specified. The format of the RNAMED key is either
#  RNAMED:"parametername(parametervalue)[,parametername(parametervalue)]"
# or
#  "RNAMED:parametername(parametervalue)[,parametername(parametervalue)]"
#
# Allowed parameters are:
#  cmd(<commandline>)    : shell command to build a connection
#  source((bind|dnscmd)) : selection of source, bind or dnscmd
#  statsfile(<filename>) : name of the named statistics file
#  testname(<testname>)  : name of the test, by default "trends"
#  title(<testtitle>)    : a one-line title of the (non-trends) test
#
# In the command line, a few substitutable variables may be specified. They are
# replaced by their current value upon querying the host:
#  %{H} : Host name as defined in the xymon hosts.cfg file
#  %{h} : Host name, but without any domain name
#  %{I} : IP address as defined in the xymon hosts.cfg file
#
# Note that this script is memoryless by design. This implies that it is not
# possible to check the query-rate against a threshold, as the rate is not
# known in this script. The rate is determined by RRD.
#
# Note with respect to pre-BIND 9.6 statistics:
#  success 	The number of successful queries made to the server or zone.
#		A successful query is defined as query which returns a NOERROR
#		response with at least one answer RR.
#  referral 	The number of queries which resulted in referral responses.
#  nxrrset 	The number of queries which resulted in NXRRSET responses
#		with no data.
#  nxdomain 	The number of queries which resulted in NXDOMAIN responses.
#  failure 	The number of queries which resulted in a failure response
#		other than those above.
#  recursion 	The number of queries which caused the server to perform
#		recursion in order to find the final answer.
#
#  Each query received by the server will cause exactly one of success,
#  referral, nxrrset, nxdomain, or failure to be incremented, and may
#  additionally cause the recursion counter to be incremented too.
#
# Written by W.J.M. Nelis, wim.nelis@nlr.nl, 2012.12
#
use strict ;
use Time::Piece ;			# Format time
use Time::Local ;

#
# Installation constants.
# -----------------------
#
# Define the level of debugging output to the standard output / logfile:
#  0= none, 1= input and output, 2= intermediate results too, 3= name mapping too
#
my $Debug = 0 ;				# Flag: Enable debug output

#
# Define the parameters to reach the Xymon server.
#
my $XyDisp= $ENV{XYMONSERVERHOSTNAME} ;	# Name of monitor server
my $XySend= $ENV{XYMON} ;		# Monitor interface program
my $XyHome= $ENV{XYMONHOME} ;		# Home directory
my $FmtDate= "%Y.%m.%d %H:%M:%S" ;	# Default date format
   $FmtDate= $ENV{XYMONDATEFORMAT} if exists $ENV{XYMONDATEFORMAT} ;
#
# Define the command to send to Xymon to retrieve the current configuration
# lines containing the bindstat test parameters.
#
my $XyGrep= "$XyHome/bin/xymongrep RNAMED:*" ;	

#
# Define the default values for the parameters which can be specified with the
# RNAMED keyword in the Xymon hosts configuration file.
#
my $DefPar= { source    => 'bind',
	      statsfile => '/var/named/named_stats.txt',
	      testname  => 'trends',
	      title	=> 'DNS statistics'
	    } ;

#
# Define the mapping of the long, hierarchical names of the statistics onto
# pairs of an RRD file name and a shorter, RRD-compatible name.
#
my %MapName= (
    'OldStyle.success'   => [ 'bindstats.rrd', 'RSqrysuccess'    ],
    'OldStyle.referral'  => [ 'bindstats.rrd', 'RSqryreferral'   ],
    'OldStyle.recursion' => [ 'bindstats.rrd', 'RSqryrecursion'  ],
    'OldStyle.nxrrset'   => [ 'bindstats.rrd', 'RSrcodenxrrset'  ],
    'OldStyle.nxdomain'  => [ 'bindstats.rrd', 'RSrcodenxdomain' ],
    'OldStyle.failure'   => [ 'bindstats.rrd', 'RSrcodefailure'  ],

    'Name_Server_Statistics.queries_resulted_in_successful_answer' => [ 'bindstats.rrd', 'RSqrysuccess'    ],
    'Name_Server_Statistics.queries_resulted_in_referral_answer'   => [ 'bindstats.rrd', 'RSqryreferral'   ],
    'Name_Server_Statistics.queries_caused_recursion'              => [ 'bindstats.rrd', 'RSqryrecursion'  ],
    'Name_Server_Statistics.queries_resulted_in_nxrrset'           => [ 'bindstats.rrd', 'RSrcodenxrrset'  ],
    'Name_Server_Statistics.queries_resulted_in_NXDOMAIN'          => [ 'bindstats.rrd', 'RSrcodenxdomain' ],
    'Name_Server_Statistics.other_query_failures'                  => [ 'bindstats.rrd', 'RSrcodefailure'  ],

    'Queries.Total.Standard'					=> [ 'wdnsstats.rrd', 'Query'     ],
    'Recursion.Query.Queries_Recursed'				=> [ 'wdnsstats.rrd', 'Recurse'   ],
    'Recursion.Failures.Recurse_Failures'			=> [ 'wdnsstats.rrd', 'RcrsFail'  ],
    'Packet_Dynamic_Update.Updates_Received.Updates_Received'	=> [ 'wdnsstats.rrd', 'DynUpdRcv' ],
    'Packet_Dynamic_Update.Updates_Received.Rejected'		=> [ 'wdnsstats.rrd', 'DynUpdRej' ],
    'Error_Stats.UNDEF.ServFail'				=> [ 'wdnsstats.rrd', 'ServFail'  ],
    'Error_Stats.UNDEF.NxDomain'				=> [ 'wdnsstats.rrd', 'NxDomain'  ],
    'Error_Stats.UNDEF.NxRRSet'					=> [ 'wdnsstats.rrd', 'NxRRSet'   ],
   ) ;

#
# Define the printf format to write a statistic, DS name and value, into the
# 'trends' message for Xymon. All counters are treated as DERIVE, rather than
# COUNTER, in order to avoid huge spikes whenever the BIND service is restarted.
# A restart will result in a negative value, which is suppressed from the graph
# by setting the minimal value to zero.
#
my $DsDef= "DS:%s:DERIVE:600:0:U %d\n" ;


#
# Global variables.
# -----------------
#
my $Result= '' ;			# Status message to Xymon
my @Work  = () ;			# Parameters from Xymon hosts config
my %Stat  = () ;			# Bind statistics
my @Lines = () ;			# Just a bunch of line images
my $I ;					# Loop control variable

#
# Function LogMessage is invoked to output a debugging message.
#
sub LogMessage($) {
  my $Msg= shift ;  chomp $Msg ;	# Line without end-of-line
  my $Now= localtime ;			# Build object with UTS
     $Now= $Now->strftime( "%Y.%m.%d %H:%M:%S" ) ;

  my $Clr= (caller(1))[3] ;		# Name of calling function
  $Clr= 'MAIN'			unless defined $Clr ;
  $Clr=~ s/^.+\:\:// ;			# Remove package name

  print "$Now $Clr: $Msg\n" ;
}  # of LogMessage

#
# Function BuildEpochTime takes two strings, one describing a date, the other
# one describing a time. It returns the associated epoch time.
#
sub BuildEpochTime($$) {
  my @Time= ( 0 ) ;			# Elements of date and time
  my $Time ;				# Time stamp

  push @Time, $2, $1		if $_[1]=~ m/^\s?(\d+):(\d+)/ ;
  push @Time, $1, $2, $3	if $_[0]=~ m/^(?:[a-z]+)?\s?(\d+)-(\d+)-(\d+)/i ;
  return 0			unless scalar(@Time) == 6 ;
  $Time[4]-- ;				# Adjust month ordinal
  return timelocal( @Time ) ;		# Timestamp 
}  # of BuildEpochTime

#
# Function BuildWorkList takes the lines from the Xymon hosts configuration
# file, extracts the relevant parts and saves them in list @Work.
#
sub BuildWorkList() {
  my ($IP,$HN,$AllPars) ;		# Host parameters
  my ($RNamed,$Pars) ;			# RNAMED parameters
  my $W ;				# Ref to element of @Work

  $I= -1 ;
  foreach ( @Lines ) {
    $I++ ;  %{$Work[$I]}= %$DefPar ;	# Copy default values
    $W= $Work[$I] ;			# Short cut
    $$W{ERROR}= 0 ;			# No error found (yet)
    LogMessage( "Input: $_" )	if $Debug ;

 # Handle the fixed Xymon parameters, the IP address and the name of the host.
    chomp ;
    ($IP,$HN,$AllPars)= m/^\s*([\d\.]+)\s+([\w\.]+)\s+#\s*(.*?)\s*$/ ;
    $$W{IP}= $IP ;  $$W{Host}= $HN ;	# Save host parameters

 # Handle keyword RNAMED: extract all its parameters. Two formats are allowed,
 # one in which the whole parameter string is enclosed between double quotes and
 # one in which the part after RNAMED: is enclosed between double quotes.
    ($RNamed,$Pars)= $AllPars=~ m/\"(RNAMED:(.+?))\"/ ;
    unless ( defined $RNamed ) {
      ($RNamed,$Pars)= $AllPars=~ m/(RNAMED:\"(.+?))\"/	;
      $RNamed=~ s/^RNAMED:\"/RNAMED:/ ;	# Remove double quote
    }  # of unless
    LogMessage( "RNamed: $RNamed" )		if $Debug > 1 ;

    while ( $Pars=~ s/^([a-z]+)\((.+?)\)(?:\s*,\s*)?// ) {
      $$W{$1}= $2 ;
      LogMessage( "RNAMED par: $1 = $2" )	if $Debug > 1 ;
    }  # of while
    if ( $Pars ) {
      LogMessage( "Error at $HN in \"$RNamed\"" ) ;
      $$W{ERROR}= 1 ;
    }  # of if
  }  # of foreach
}  # of BuildWorkList

#
# Function PolishName returns the input string, after replacing all
# non-alphanumeric characters by an underscore.
#
sub PolishName($) {
  my $Name= $_[0] ;			# Fetch name
  $Name=~ tr/-_0-9a-zA-Z/_/c ;
  $Name=~ s/_{2,}//g ;
  LogMessage( " \"$_[0]\" into \"$Name\"" )	if $Debug > 2 ;
  return $Name ;
}  # of PolishName

#
# Function Recent takes a time stamp and returns that value if the time stamp
# lies between now and 10 minutes ago. If the clock of the DNS server is up to
# 10 seconds ahead wrt the time on the Xymon server, the current time at the
# Xymon server is returned. In all other cases it will return a false value.
#
sub Recent($$) {
  my $Age= time - $_[1] ;			# Age
  if ( $Age < -10 ) {
    LogMessage( "$_[0]: Statistics too young" ) ;
    return 0 ;					# Return a false value
  } elsif ( $Age < 0 ) {
    return time;				# Correct for time slack
  } elsif ( $Age > 600 ) {
    LogMessage( "$_[0]: Statistics too old" ) ;
    return 0 ;					# Return a false value
  } else {
    return $_[1] ;				# Return a true value
  }  # of else
}  # of Recent

#
# Function QueryServer retrieves the dns statistics from one server. All
# information needed is passed in the work list entry. The retrieved information
# is written to global list @Lines.
#
# The script to be executed at the (remote) bind server is stripped to the
# minimum. Add error detection and a session time limit.
#
sub QueryServer($) {
  my $W= shift ;			# Ref to worklist item

  my $SHost= (split(/\./,$$W{Host}))[0] ;	# Short host name
  my $Cmd  = $$W{cmd} ;			# Retrieve command
  if ( index($Cmd,' ') < 0 ) {
    $Cmd.= " $$W{Host}" ;		# Append hostname
  } else {
    $Cmd=~ s/%\{H\}/$$W{Host}/g ;	# Substitute full host name
    $Cmd=~ s/%\{h\}/$SHost/g ;		# Substitute short host name
    $Cmd=~ s/%\{I\}/$$W{IP}/g ;		# Substitute IP address
  }  #of else

  if ( $$W{source} eq 'bind' ) {
    my $StdIn= "{ echo \"cat $$W{statsfile}\" ; echo \"exit\" ; }" ;
    $Cmd= "$StdIn | $Cmd" ;		# Build script
  }  # of if

  LogMessage( " Cmd = $Cmd" )	if $Debug ;
  @Lines= `$Cmd` ;			# Retrieve standard output
}  # of QueryServer

#
# Function ParseBindStatistics takes the raw statistics of a BIND server and
# stores it in a multi-level data-structure.
#
sub ParseBindStatistics($) {
  my $W= shift ;			# Ref to worklist item
  my $Section= '' ;			# Hierarchy of statistics
  my $View   = '' ;
  my ($Var,$Val) ;

  foreach ( @Lines ) {
    chomp ;
    next		if m/^\s*$/ ;	# Skip empty line
    next		if m/^---/ ;	# Skip end-of-statistics line
    next		if m/^\[---.*---\]$/ ;


    if ( m/^\+\+\+ Statistics Dump \+\+\+ \((\d+)\)/ ) {
      $Val= Recent( $$W{Host}, $1 ) ;	# Check time of measurement
      return		unless $Val ;	# Return if not a recent sample
      $Stat{ToM}= $Val ;		# Save time of measurement
      $Section= 'OldStyle' ;		# Assume pre BIND-9.6
      $View   = '' ;

    } elsif ( m/^\+\+\s+(.+?)\s*\+\+$/ ) {
      $Section= PolishName( $1 ) ;
      $View   = '' ;
    } elsif ( m/^\[Common\]$/ ) {
      $View   = 'Common' ;
    } elsif ( m/^\[View:\s+(.+)\]$/ ) {
      $View   = $1 ;

    } elsif ( m/^\s*(\d+)\s+(.+?)\s*$/ ) {
      $Val= $1 ;  $Var= PolishName( $2 ) ;
      $Stat{Long}{$Section}{$View}{$Var}= $Val ;
      LogMessage( " $Section - $View - $Var = $Val" )	if $Debug ;
    } elsif ( m/^\s*([a-z]+)\s+(\d+)\s*$/ ) {
      $Var= PolishName( $1 ) ;  $Val= $2 ;
      $Stat{Long}{$Section}{$View}{$Var}= $Val ;
      LogMessage( " $Section - $View - $Var = $Val" )	if $Debug ;

    } else {
      LogMessage( "Error - Unexpected BIND statistics line at $$W{Host}" ) ;
      LogMessage( "        $_" ) ;
      $$W{ERROR}= 1 ;
    }  # of else
  }  # of foreach

}  # of ParseBindStatistics

#
# Function ParseDnscmdStatistics uses the hierarchical structure of the Windows
# DNS service statistics to identify and extract all values.
#
# The statistics are divided into sections, each section is divided in one or
# more views. A view consists of one or more lines containing the name of a
# variable and its value. A section starts with the section name in the leftmost
# column and is followed by a line of dashes. A view has the same format as a
# section, but it is not followed by a line of dashes. Moreover, a view can be
# given a value. This is considered to be a shorthand notation, that is
#   <AView> = <AValue>
# is considered to be equivalent to
#   <AView>:
#      <AView> = <AValue>
# Within a view two levels of variables are defined, a top-level and a
# sub-level. The indentation of a top-level line is short, the indentation of
# a sub-level line is longer. Each section has its own indentation settings.
# Sub-level lines are ignored.
#
# The statistics are augmented with a time stamp at the start. The first line
# specifies the date, using format dd-mm-yyyy, the second line the time of
# collecting the statistics, using format hh:mm.
#
sub ParseDnscmdStatistics($) {
  my $W= shift ;			# Ref to worklist entry
  my $Candidate= undef ;		# Name of section or view
  my $Section  = undef ;		# Name of section
  my $View     = undef ;		# Name of view
  my $Variable = undef ;		# Name of variable
  my $Value    = undef ;		# Its value
  my $Indent0  = undef ;		# RE for top-level indent
  my $Indent1  ;			# RE for sub-level indent

  return			if @Lines < 4 ;
  my $Date= shift @Lines ;		# Fetch time of this set of
  my $Time= shift @Lines ;		#   statistics
  $Time= BuildEpochTime( $Date, $Time ) ;
  $Time= Recent( $$W{Host}, $Time ) ;
  return		unless $Time ;	# Exit if sample is outdated
  $Stat{ToM}= $Time ;			# Save time of retrieval

  foreach ( @Lines ) {
    chomp ;				# Remove trailing Lf
    s/\cM$// ;				# Remove trailing Cr
    s/\(.+?\)// ;			# Remove (text)
    next		if m/^\s*$/ ;	# Skip an empty line

 # Handle a name found on the previous line, which can be either the name of
 # a section or the name of a view.
    if ( defined $Candidate ) {
      if ( m/^\-+\s*$/ ) {
	$Section= $Candidate ;
	$Candidate= undef ;
	$View     = undef ;
	$Indent0  = undef ;
	$Variable = undef ;
	next ;
      } else {
	$View= $Candidate ;
	$Candidate= undef ;
	$Variable = undef ;
      }  # of else
    }  # of if

 # Handle the (potential) section header.
    if      ( m/^([A-Z][\w ]+?)\s*:?\s*$/ ) {
      $Candidate= PolishName( $1 ) ;

 # Handle the view header.
#   } elsif ( m/^([A-Z][\w ]+):\s*$/ ) {
#     $Candidate= PolishName( $1 ) ;
    } elsif ( m/^([A-Z][\w ]+?)\s*=\s+(\d+)\s*$/ ) {
      $View= PolishName( $1 ) ;
      $Variable= $View ;
      $Value   = $2 ;
      $Stat{Long}{$Section}{$View}{$Variable}= $Value ;
      LogMessage( " $Section - $View - $Variable = $Value" )	if $Debug ;

 # Handle a line, either a top-level or a sub-level definition.
    } elsif ( ! defined $Indent0 ) {
      if ( m/^(\s+)([A-Z][\w\- ]+?)\s*=\s*(\d+)$/ ) {
	$Indent0 = $1 ;			# Indentation of top-level
	$Indent1 = $1 . '\s+' ;		# Indentation of sub-level
	$Variable= PolishName( $2 ) ;
	$Value   = $3 ;
	$View= 'UNDEF'		unless defined $View ;
	$Stat{Long}{$Section}{$View}{$Variable}= $Value ;
	LogMessage( " $Section - $View - $Variable = $Value" )	if $Debug ;
      } else {
	LogMessage( "Error - Unexpected WinDns statistics line at $$W{Host}" ) ;
	LogMessage( "        $_" ) ;
	$$W{ERROR}= 1 ;
      }  # of else
    } elsif ( m/^$Indent0([A-Z][\w\- ]+?)\s*=\s*(\d+)\s*$/ ) {
      $Variable= PolishName( $1 ) ;
      $Value   = $2 ;
      $View= 'UNDEF'		unless defined $View ;
      $Stat{Long}{$Section}{$View}{$Variable}= $Value ;
      LogMessage( " $Section - $View - $Variable = $Value" )	if $Debug ;
    } elsif ( m/^$Indent0[A-Z][\w\- ]+:?\s*$/ ) {
      next ;				# Ignore another deeper level
    } elsif ( m/^$Indent1[A-Z]/ ) {
      next ;				# Ignore this deep statistic

 # Handle the remaining cases.
    } elsif ( m/^Command completed successfully./ ) {
      last ;				# End of statistics found
    } else {
      LogMessage( "Error - Unexpected WinDns statistics line at $$W{Host}" ) ;
      LogMessage( "        $_" ) ;
      $$W{ERROR}= 1 ;
    }  # of else
  }  # of foreach

}  # of ParseDnscmdStatistics

#
# Function ParseStatistics invokes the appropriate parser, depending on the
# source of the statistical information.
#
sub ParseStatistics($) {
  my $W= shift ;			# Ref to work list item

  if      ( $$W{source} eq 'bind'   ) {
    ParseBindStatistics( $W ) ;
  } elsif ( $$W{source} eq 'dnscmd' ) {
    ParseDnscmdStatistics( $W ) ;
  }  # of elsif
}  # of ParseStatistics

#
# Function MapName takes the results of one BIND server. It maps the (long)
# names of the variables onto short ones, which are usable as DS-names in an
# RRD file. At the same time, a default value is given for unknown variables.
#
sub MapName() {
  my $LongName ;			# Fully qualified name
  my $M ;				# Ref into statistics result hash

  foreach my $Section  ( keys %{$Stat{Long}} ) {
    foreach ( keys %MapName ) {
      next			unless index($_,$Section) == 0;
      $M= $MapName{$_} ;
      $Stat{Short}{$$M[0]}{$$M[1]}= 0 ;
    }  # of foreach
 #
    foreach my $View   ( keys %{$Stat{Long}{$Section}} ) {
      foreach my $Stat ( keys %{$Stat{Long}{$Section}{$View}} ) {
	$LongName= "$Section.$View.$Stat" ;
	$LongName=~ s/\.\./\./ ;	# Remove empty view name
	next			unless exists $MapName{$LongName} ;

	LogMessage( "save \"$LongName\"" )	if $Debug > 1 ;
	$M= $MapName{$LongName} ;	# Ref into mapping
	$Stat{Short}{$$M[0]}{$$M[1]}= $Stat{Long}{$Section}{$View}{$Stat} ;
      }  # of foreach
    }  # of foreach
  }  # of foreach
}  # of MapName

#
# Build the Xymon message and inform Xymon. The message can be sent in one of
# two formats: a 'trends' message or a 'status' message. The former includes
# the name of the RRD file and the DS definition, requiring no additional config
# of Xymon. The latter shows up as a separate column with a graph, and requires
# the use of an additional server-side script (and configuration) to distribute
# the NCV data to the RRD files.
#
sub InformXymon($) {
  my $Work  = shift ;			# Ref to work descriptor
  my $XyTest= $$Work{testname} ;	# Name of test
  my $XyHost= $$Work{Host} ;		# Name of BIND server
  my $Colour= 'green' ;			# Initial test status
  my $Now ;				# Time of retrieval of statistics
  my ($Rrd,$DS) ;			# Loop control variables
  my $Key ;

  if ( $XyTest ne 'trends' ) {
    $Now= localtime( $Stat{ToM} ) ;
    $Now= $Now->strftime( $FmtDate ) ;
 #
 # Save the results in NCV format, using extended names which include the name
 # of the RRD file. In Xymon, this data is channeled to a script defined in
 # the --extra-script parameter of rrdstatus.
 #
    $Result = "<!--\n" ;
    foreach $Rrd ( sort keys %{$Stat{Short}} ) {
      foreach $DS ( sort keys %{$Stat{Short}{$Rrd}} ) {
        $Key= $Rrd ;  $Key=~ s/\.rrd$// ;
	$Result.= "$Key/$DS : $Stat{Short}{$Rrd}{$DS}\n" ;
      }  # of foreach
    }  # of foreach
    $Result.= "-->" ;

    $Result = "\"status $XyHost.$$Work{testname} $Colour $Now\n" .
	      "$$Work{title}\n" .
	      $Result . "\"\n" ;
    `$XySend $XyDisp $Result` ;		# Inform Xymon

  } else {				# Format is 'trends'
 #
 # Save the results in the special 'trends message' format. It consists of one
 # or more sections formatted like:
 #   [<RrdFileName>]
 #   DS:<DsSpecification> <DsValue>
 #
    $Result= '' ;
    foreach $Rrd ( sort keys %{$Stat{Short}} ) {
      next			unless keys %{$Stat{Short}{$Rrd}} ;
      $Result.= "[$Rrd]\n" ;
      foreach $DS ( sort keys %{$Stat{Short}{$Rrd}} ) {
	$Result.= sprintf( $DsDef, $DS, $Stat{Short}{$Rrd}{$DS} ) ;
      }  # of foreach
    }  # of foreach

    if ( $Result ne '' ) {
      $Result ="\"data $XyHost.trends\n" . $Result . "\"\n" ;
      LogMessage( "XySend: $Result" )	if $Debug ;
      `$XySend $XyDisp $Result` ; 	# Inform Xymon
    }  # of if
  }  # of else
}  # of InformXymon


#
# MAIN PROGRAM.
# -------------
#
unless ( defined $XyDisp ) {
  LogMessage( 'Error: no Xymon environment defined' ) ;
  exit ;
}  # of unless

@Lines= `$XyGrep` ;			# Get list of DNS servers
exit			unless @Lines ;	# Stop if nothing to do
BuildWorkList ;				# Build nice list of work to do

foreach ( @Work ) {
  next			if $$_{ERROR} ;	# Skip in case of RNAMED syntax error
  %Stat= () ;				# Clear statistics save area
  QueryServer( $_ ) ;			# Query next server
  ParseStatistics( $_ ) ;		# Extract statistics and format them
  MapName ;				# Map long name onto short RRD name
  InformXymon( $_ ) ;			# Send data to Xymon
}  # of foreach

If the statistics are sent to Xymon using a status message, a script like the one below is needed to parse the NCV data and pass them to an RRD.

#!/usr/bin/perl
#
# This script handles a list of NCVs, sent by a Xymon client, and prepares it to
# be stored in an RRA. This script is used in cases in which a fixed-size group
# of two or more values should be put together into a single RRA. The algorithm
# is specific for each test / client.
#
# This script is invoked with three parameters: the name of the host, the name
# of the test and the name of the file containing the message sent by the
# client, containing the NCVs to be handled.
#
use strict;

#
# Installation constants.
# -----------------------
#
# %Struct defines the datasets of the various tests.
#
my %Struct= (
	bindstats => [		# Must be sorted!
		"DS:RSqryrecursion:DERIVE:600:0:U\n" ,
		"DS:RSqryreferral:DERIVE:600:0:U\n" ,
		"DS:RSqrysuccess:DERIVE:600:0:U\n" ,
		"DS:RSrcodefailure:DERIVE:600:0:U\n" ,
		"DS:RSrcodenxdomain:DERIVE:600:0:U\n" ,
		"DS:RSrcodenxrrset:DERIVE:600:0:U\n" ],
   ) ;  # of %Struct

#
# Global variables.
# -----------------
#
my ( $HostName, $TestName, $FileName )= @ARGV ;
#
my %Var= () ;				# Save area measurements
my ( $Line, @Line ) ;			# List of values of one measurement
my $key ;				# Loop control variable


#
# Main program.
# -------------
#

#
# An attempt has been undertaken to make this code a little bit more general.
# The name of an NCV should consist of two names separated by "/". The first
# name becomes (part of) the name of the RRA, the second name becomes the
# name of the DS. The DS-ses are written in sorted order.
#
if ( $TestName eq 'bindstats' ) {
  open( FH, "<", $FileName )	or die ;
  while ( <FH> ) {
    chomp ;
    next			unless m/^([\w\.\,-]+)\/(\w+)\s+:\s+(U|[\d\.]+)\s*$/ ;
    $Var{$1}{$2}= $3 ;
  }  # of while
  close( FH ) ;

  print @{$Struct{$TestName}} ;
  foreach $key ( sort keys %Var ) {
    @Line= () ;
    push @Line, $Var{$key}{$_}	foreach ( sort keys %{$Var{$key}} ) ;
    if ( $TestName eq $key ) {
      print "$TestName.rrd\n" ;
    } else {
      print "$TestName.$key.rrd\n" ;
    }  # of else
    print join( ":", @Line ) . "\n" ;
  }  # of foreach

}  # of if

exit 0 ;
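
The NCV lines handled by the script above have the form "<rrd>/<ds> : <value>". The round trip can be illustrated with the following sketch, in which the function names build_ncv and parse_ncv as well as the counter values are made up. The parsing pattern is the one used in the script above.

```perl
#!/usr/bin/perl
use strict ;
use warnings ;

# Made-up counter values, keyed by DS name.
my %sample= ( RSqrysuccess => 1200, RSqryreferral => 15, RSrcodenxdomain => 42 ) ;

# Hypothetical helper: build NCV lines in the form "<rrd>/<ds> : <value>".
sub build_ncv {
  my ( $rrd, $values )= @_ ;
  return map { "$rrd/$_ : $$values{$_}\n" } sort keys %$values ;
}

# Hypothetical helper: parse NCV lines into a two-level hash, indexed by the
# RRD name and the DS name. Lines in another format are skipped.
sub parse_ncv {
  my %var ;
  foreach ( @_ ) {
    next        unless m/^([\w.,-]+)\/(\w+)\s+:\s+(U|[\d.]+)\s*$/ ;
    $var{$1}{$2}= $3 ;
  }
  return \%var ;
}

my @ncv= build_ncv( 'bindstats', \%sample ) ;
print @ncv ;                            # The NCV lines as sent to Xymon
my $var= parse_ncv( @ncv ) ;
# The values, joined in sorted DS order, as needed for an RRD update.
print join( ':', map { $$var{bindstats}{$_} } sort keys %{$$var{bindstats}} ), "\n" ;
```

As the DS names are handled in sorted order on both sides, the order of the values matches the order of the DS definitions only if those definitions are sorted as well.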

The following snippet defines a graph showing the collected BIND statistics. Add this snippet to $XYMONHOME/etc/graphs.cfg if at least one BIND server is being monitored.

[bindstats]
	TITLE , Bind query rates  
	YAXIS Rate [q/s]
	-l 0
	DEF:success=bindstats.rrd:RSqrysuccess:AVERAGE
	DEF:referal=bindstats.rrd:RSqryreferral:AVERAGE
	DEF:recursn=bindstats.rrd:RSqryrecursion:AVERAGE
	DEF:failure=bindstats.rrd:RSrcodefailure:AVERAGE
	DEF:nxrrset=bindstats.rrd:RSrcodenxrrset:AVERAGE
	DEF:nxdomain=bindstats.rrd:RSrcodenxdomain:AVERAGE
	CDEF:total=success,referal,+,failure,+,nxrrset,+,nxdomain,+
	AREA:success#348017:success
	GPRINT:success:MIN:     Min\: %5.1lf %sq/s
	GPRINT:success:MAX:Max\: %5.1lf %sq/s
	GPRINT:success:AVERAGE:Avg\: %5.1lf %sq/s
	GPRINT:success:LAST:Cur\: %5.1lf %sq/s\n
	AREA:referal#52D017:referral:STACK
	GPRINT:referal:MIN:    Min\: %5.1lf %sq/s
	GPRINT:referal:MAX:Max\: %5.1lf %sq/s
	GPRINT:referal:AVERAGE:Avg\: %5.1lf %sq/s
	GPRINT:referal:LAST:Cur\: %5.1lf %sq/s\n
	AREA:failure#F88017:Err-failure:STACK
	GPRINT:failure:MIN: Min\: %5.1lf %sq/s
	GPRINT:failure:MAX:Max\: %5.1lf %sq/s
	GPRINT:failure:AVERAGE:Avg\: %5.1lf %sq/s
	GPRINT:failure:LAST:Cur\: %5.1lf %sq/s\n
	AREA:nxrrset#C35817:Err-nxrrset:STACK
	GPRINT:nxrrset:MIN: Min\: %5.1lf %sq/s
	GPRINT:nxrrset:MAX:Max\: %5.1lf %sq/s
	GPRINT:nxrrset:AVERAGE:Avg\: %5.1lf %sq/s
	GPRINT:nxrrset:LAST:Cur\: %5.1lf %sq/s\n
	AREA:nxdomain#FF0000:Err-nxdomain:STACK
	GPRINT:nxdomain:MIN:Min\: %5.1lf %sq/s
	GPRINT:nxdomain:MAX:Max\: %5.1lf %sq/s
	GPRINT:nxdomain:AVERAGE:Avg\: %5.1lf %sq/s
	GPRINT:nxdomain:LAST:Cur\: %5.1lf %sq/s\n
	LINE1:total#000000:Total
	GPRINT:total:MIN:       Min\: %5.1lf %sq/s
	GPRINT:total:MAX:Max\: %5.1lf %sq/s
	GPRINT:total:AVERAGE:Avg\: %5.1lf %sq/s
	GPRINT:total:LAST:Cur\: %5.1lf %sq/s\n
	LINE1:recursn#0000FF:recursion
	GPRINT:recursn:MIN:   Min\: %5.1lf %sq/s
	GPRINT:recursn:MAX:Max\: %5.1lf %sq/s
	GPRINT:recursn:AVERAGE:Avg\: %5.1lf %sq/s
	GPRINT:recursn:LAST:Cur\: %5.1lf %sq/s\n

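The following snippet defines a graph showing the collected Windows DNS statistics. Add this snippet to $XYMONHOME/etc/graphs.cfg if at least one Windows DNS server is being monitored.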

[wdnsstats]
        TITLE , DNS statistics  
        YAXIS Rate [/s] 
        -l 0 
        DEF:qry=wdnsstats.rrd:Query:AVERAGE
        DEF:upd=wdnsstats.rrd:DynUpdRcv:AVERAGE
        DEF:er0=wdnsstats.rrd:RcrsFail:AVERAGE
        DEF:er1=wdnsstats.rrd:DynUpdRej:AVERAGE
        DEF:er2=wdnsstats.rrd:ServFail:AVERAGE
        DEF:er3=wdnsstats.rrd:NxDomain:AVERAGE
        CDEF:err=er0,er1,+,er2,+,er3,+
        LINE1:qry#66CC33:Query
        GPRINT:qry:MIN: Min\: %5.1lf %s
        GPRINT:qry:MAX:Max\: %5.1lf %s
        GPRINT:qry:AVERAGE:Avg\: %5.1lf %s
        GPRINT:qry:LAST:Cur\: %5.1lf %s\n
        LINE1:upd#9933FF:Update
        GPRINT:upd:MIN:Min\: %5.1lf %s
        GPRINT:upd:MAX:Max\: %5.1lf %s
        GPRINT:upd:AVERAGE:Avg\: %5.1lf %s
        GPRINT:upd:LAST:Cur\: %5.1lf %s\n
        LINE1:err#FF0000:Error
        GPRINT:err:MIN: Min\: %5.1lf %s
        GPRINT:err:MAX:Max\: %5.1lf %s
        GPRINT:err:AVERAGE:Avg\: %5.1lf %s
        GPRINT:err:LAST:Cur\: %5.1lf %s\n

By default, the statistics are presented to Xymon using a so-called 'trends' data message. This type of message carries neither a status nor a time stamp. Thus problems in collecting the statistics cannot be reported as a status. The independent polling cycles at client and server introduce variable delays, which cannot be accounted for, even though the time of data collection is known to the server-side script.
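
The layout of such a 'trends' data message is illustrated by the sketch below. Function build_trends, the host name and the counter values are made up for this example; the DS definition is the one used in retds.pl. The message contains a section per RRD file, and within a section one line per DS, but no colour and no time stamp.

```perl
#!/usr/bin/perl
use strict ;
use warnings ;

# Hypothetical helper: format a 'trends' data message for one host.
# $stat maps an RRD file name onto a hash of DS names and counter values.
sub build_trends {
  my ( $host, $stat )= @_ ;
  my $msg= '' ;
  foreach my $rrd ( sort keys %$stat ) {
    next                unless %{$$stat{$rrd}} ;        # Skip an empty RRD
    $msg.= "[$rrd]\n" ;
    foreach my $ds ( sort keys %{$$stat{$rrd}} ) {
      $msg.= sprintf( "DS:%s:DERIVE:600:0:U %d\n", $ds, $$stat{$rrd}{$ds} ) ;
    }
  }
  return $msg eq '' ? '' : "data $host.trends\n$msg" ;  # Nothing to send
}

# Example values only.
print build_trends( 'ns1.example.com',
                    { 'bindstats.rrd' => { RSqrysuccess   => 1200,
                                           RSrcodefailure => 3 } } ) ;
```

The use of DERIVE with a minimum of zero means that a counter reset, for example after a restart of the DNS service, yields an unknown value rather than a spike in the graph.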

It is assumed that the BIND statistics file contains only one set of statistics. Thus the new statistics overwrite the old statistics, rather than being appended.
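
Whether a statistics file satisfies this assumption can be checked with a sketch like the one below, which counts the dump headers and keeps only the last dump. Function last_dump is hypothetical, and the sample lines are made up in the old, pre-9.6 style; the header pattern is the one matched by retds.pl.

```perl
#!/usr/bin/perl
use strict ;
use warnings ;

# Hypothetical helper: return the number of statistics dumps found, followed
# by the lines of the last dump only.
sub last_dump {
  my @lines= @_ ;
  my ( $count, @dump )= ( 0 ) ;
  foreach ( @lines ) {
    if ( m/^\+\+\+ Statistics Dump \+\+\+ \((\d+)\)/ ) {
      $count++ ;  @dump= () ;           # A new dump starts here
    }
    push @dump, $_ ;
  }
  return ( $count, @dump ) ;
}

# Made-up file content holding two appended dumps.
my @file= (
  "+++ Statistics Dump +++ (1426598100)\n", "success 100\n",
  "--- Statistics Dump --- (1426598100)\n",
  "+++ Statistics Dump +++ (1426598400)\n", "success 130\n",
  "--- Statistics Dump --- (1426598400)\n",
) ;
my ( $n, @last )= last_dump( @file ) ;
print "dumps found: $n\n", @last ;
```

Moving the statistics file away after each dump, as in the crontab example for the BIND server, is one way to make sure that the file retrieved holds a single set of statistics.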

This script has only been tested using 'trends' messages. The function which uses a status message to inform Xymon has thus not been tested.

Script retds.pl was written using script xymon-rnamedstats.sh, written by Jeremy Laidman, as the source of inspiration.

  • 2015-03-17
    • Initial release on xymonton
  • monitors/retds.txt
  • Last modified: 2015/03/18 20:30
  • by jez