monitors:zonemem

zonemem.ksh

Author: Vernon Everett
Compatibility: Xymon 4.2
Requirements: It monitors zones, so I guess Solaris is pretty important.
Download: None
Last Update: 2013-08-02

I wanted a single unified view of zone memory utilisation, to assist with resource allocation improvements.
Then I added swap utilisation, and then graphing too, just for the hell of it.
The graphing is useful in some circumstances and not so much in others.
Then I added alerting for when thresholds are breached.
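
As a rough illustration of the alerting step, the sketch below classifies a usage percentage against a yellow and a red threshold, in the same spirit as the script's MEMYEL/MEMRED and SWAPYEL/SWAPRED checks. The function name classify_pct is hypothetical and not part of zonemem.ksh, and truncating the percentage to a whole number before comparing is an assumption made here to keep the sketch portable across shells.

```shell
#!/bin/sh
# Hypothetical helper, not part of zonemem.ksh: map a usage percentage
# to a Xymon colour, given a yellow and a red threshold.
classify_pct ()
{
   pct=${1%.*}                 # drop any fractional part, e.g. 96.42 -> 96
   [ -z "$pct" ] && pct=0      # ".42" truncates to an empty string, treat as 0
   yellow=$2
   red=$3
   if [ "$pct" -gt "$red" ]
   then
      echo red
   elif [ "$pct" -gt "$yellow" ]
   then
      echo yellow
   else
      echo green
   fi
}

classify_pct 96.42 80 95   # memory at 96.42% with MEMYEL=80, MEMRED=95
classify_pct 77.10 75 90   # swap at 77.10% with SWAPYEL=75, SWAPRED=90
```

The comparisons are strictly "greater than", as in the script, so a zone sitting exactly on a threshold stays at the lower colour.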

NOTE: The script only runs in the global zone.

Client side

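Copy the script to $XYMONCLIENTHOME/ext in the normal way and make sure it's executable.
With ZFSFREE=true (the default), the script also calls $XYMONCLIENTHOME/ext/getmemstat.ksh, so that script needs to be installed as well.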
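
The script also has to be scheduled by the Xymon client. An entry in the client's clientlaunch.cfg might look like the fragment below. The section name, log file name and 5 minute interval are assumptions for illustration, not taken from the original page.

```
[zonemem]
        ENVFILE $XYMONCLIENTHOME/etc/xymonclient.cfg
        CMD $XYMONCLIENTHOME/ext/zonemem.ksh
        LOGFILE $XYMONCLIENTLOGS/zonemem.log
        INTERVAL 5m
```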

Server side

If you want graphing, add the following to xymonserver.cfg

SPLITNCV_zmem="*:GAUGE"

Then, also in xymonserver.cfg, append the following to the existing TEST2RRD= list

zmem=ncv

and the following to the existing GRAPHS= list

zmem::4

Then add the following to graphs.cfg

[zmem]
    FNPATTERN ^zmem,(.*).rrd
    TITLE Zone %Memory Utilisation
    YAXIS %
    -l 0
    -u 100
    DEF:p@RRDIDX@=@RRDFN@:lambda:AVERAGE
    LINE2:p@RRDIDX@#@COLOR@:@RRDPARAM@
    GPRINT:p@RRDIDX@:LAST: \: %5.1lf (cur)
    GPRINT:p@RRDIDX@:MAX: \: %5.1lf (max)
    GPRINT:p@RRDIDX@:MIN: \: %5.1lf (min)
    GPRINT:p@RRDIDX@:AVERAGE: \: %5.1lf (avg)\n
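
The graph is fed by the "data" message the script sends at the end of each run: one zone:value line per zone, which the NCV handler (with SPLITNCV) should store in one RRD file per zone, matching the FNPATTERN above. The sketch below reproduces that transformation on two made-up report lines. The zone names and figures are invented, and the helper name report_to_ncv is hypothetical.

```shell
#!/bin/sh
# Turn zonemem report lines into NCV "name:value" pairs.
# Mirrors the awk/sed stage of the script's data message:
# field 1 is the zone name, field 3 is MemUsed% with a trailing "%".
report_to_ncv ()
{
   awk '{ print $1":"$3 }' | sed 's/%$//'
}

# Two invented report lines in the script's column order.
printf '%s\n' \
   'webzone       4G  42.10%   2370M    1726M    5.26%  |   8G   12.50%   1024M' \
   'dbzone       n/a  17.35%  10240M    5685M   17.35%  |  n/a    3.20%    512M' \
   | report_to_ncv
```

Only the memory percentage is graphed. The swap columns appear on the status page but are not part of the data message.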
    

#!/usr/bin/ksh
if [ "$(/usr/bin/zonename)" != "global" ]
then
   echo "I only run in global zones."
   exit 1
fi
SWAPRED=90
SWAPYEL=75
MEMRED=95
MEMYEL=80
TEMPFILE=$XYMONTMP/zonemem.tmp
COLOUR=green

ALERT_HIGH_SWAP=true
ALERT_HIGH_MEM=true

typeset -L10 ZONE
typeset -R5 MEMCAP SWAPCAP CAP
typeset -R6 PCT_USED PCTSWAP
typeset -R8 MEMUSED MEMFREE TOTMEMFREE GLOBUSED RSS PCT_USEDG SWAP
# If we count the ZFS cache as free mem, then set this to true.
# If false, we will use the vmstat definition of free memory.
ZFSFREE=true
ALERT=1  # 1= alert on high usage
YELLOW=80
RED=90
INCLUDE_GLOBAL=true

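# Now we redefine some variables, if they are set in clientlocal.
# Lines look like ZONEMEM:VAR=value. The eval is needed so the shell treats
# the text as an assignment instead of a command name.
LOGFETCH=${XYMONTMP}/logfetch.$(uname -n).cfg
if [ -f $LOGFETCH ]
then
   grep "^ZONEMEM:" $LOGFETCH | cut -d":" -f2 \
      | while read NEW_DEF
        do
           eval "$NEW_DEF"
        done
fi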


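size_to_K ()
{
    # Convert a size with a K, M, G or T suffix (e.g. 512M) to kilobytes.
    typeset -R1 UNIT
    SIZE=$1
    UNIT=$SIZE      # -R1 keeps only the last character: the unit
    case $UNIT in
       K)
          MULTIPLIER=1;;
       M)
          MULTIPLIER=1024;;
       G)
          MULTIPLIER=1048576;;
       T)
          MULTIPLIER=1073741824;;
    esac
    SIZE=${SIZE%?}  # strip the unit
    ((SIZE=SIZE*MULTIPLIER))
    echo $SIZE
}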

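size_K_to_G ()
{
   # Convert kilobytes to gigabytes (two decimals),
   # or to megabytes if we add an M parameter.
   if [ "$2" = "M" ]
   then
      GSize=$(echo "$1/1024" | bc)M
   else
      # Add .005 and drop the last digit to simulate rounding.
      GSize=$(echo "scale=3; $1/1048576+0.005" | bc)
      GSize=${GSize%?}G
   fi
   echo $GSize
}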

INSTALLED_MEM=$(prtdiag -v | grep Memory)
GLOBMEM=$(echo $INSTALLED_MEM | awk '{ print $3"M" }')
GLOBMEM=$(size_to_K $GLOBMEM)

date > $TEMPFILE
echo >> $TEMPFILE
echo "Global $INSTALLED_MEM" | sed "s/size:/size - /g" >> $TEMPFILE

NUMZONES=$(zoneadm list | wc -l)
NUMZONES=$(echo $NUMZONES | sed 's/^[ \t]*//')
if [ "$ZFSFREE" = "true" ]
then
    MEMFREE_GLOB=$( $XYMONCLIENTHOME/ext/getmemstat.ksh | egrep "^ZFS File Data|^Free" | cut -c18-  | awk '{ sum += $2 } END { print sum }')
    MEMFREE_GLOBK=$(size_to_K ${MEMFREE_GLOB}M)
else
    MEMFREE_GLOBK=$(vmstat 2 2 | tail -1 | awk '{ print $5 }')
fi


echo "                     M E M O R Y                           |          S W A P " >> $TEMPFILE
echo "Zone      MemCap   MemUsed% AvailMem  UsedMem  %of Global  | SwapCap  SwapUsed%  SwapUsed " >> $TEMPFILE
prstat -n 1,10 -Z 1 2 | grep -v "^Total" | tail -${NUMZONES} | grep -v "global" | \
while read ID PROC SWAP RSS MEM TIME CPU ZONE
do
   ALERT_TAG=""
   MEMCAP=$(zonecfg -z $ZONE info capped-memory | grep physical | cut -d: -f2)
   SWAPCAP=$(zonecfg -z $ZONE info capped-memory | grep swap | cut -d: -f2 | sed 's/]//g')
   MEMUSED=$(size_to_K $RSS)
      # Add the .005 to simulate rounding, because we truncate afterwards.
   PCT_USEDG=$(echo "scale=3; ((${MEMUSED}*100/${GLOBMEM})+0.005)" | bc)
   PCT_USEDG=${PCT_USEDG%?}
   if [ -z $MEMCAP ]
   then
      # No cap, we look at the total free.
      MEMFREE=$MEMFREE_GLOBK
      PCT_USED=$PCT_USEDG
      CAP="n/a"
      MEMFREEL=$MEMFREE_GLOBK
   else
      # With a cap we need to compare total used to the cap
      MEML=$(size_to_K $MEMCAP)
      MEMFREEL=$(echo ${MEML}-${MEMUSED} | bc)
      MEMFREE=$MEMFREEL
      PCT_USED=$(echo "scale=3; (${MEMUSED}*100/${MEML})+0.005" | bc)
      # Add the .005 to simulate rounding, because we truncate afterwards.
      PCT_USED=${PCT_USED%?}
      CAP="$MEMCAP"
   fi
   TOTMEMFREE=$(size_K_to_G $MEMFREE M)

   GSWAP=$(swap -s | awk '{ print $2 }' | sed "s/k$//g")
   if [ -z $SWAPCAP ]
   then
      SWAPCAP="n/a"
      TOTSWAPK=$GSWAP
      # No cap, we look at the total free.
   else
      TOTSWAPK=$(size_to_K $SWAPCAP)
      # With a cap we need to compare total used to the cap
   fi
   SWAPK=$(size_to_K $SWAP)
   PCTSWAP=$(echo "scale=3; (${SWAPK}*100/${TOTSWAPK})+0.005" | bc)
   PCTSWAP=${PCTSWAP%?}
   if [ "$ALERT_HIGH_SWAP" = "true" ]
   then
      [ $PCTSWAP -gt $SWAPYEL ] && ALERT_TAG="yellow"
      [ $PCTSWAP -gt $SWAPRED ] && ALERT_TAG="red"
   fi
   if [ "$ALERT_HIGH_MEM" = "true" ]
   then
      [ $PCT_USED -gt $MEMYEL -a "$ALERT_TAG" != "red" ] && ALERT_TAG="yellow"
      [ $PCT_USED -gt $MEMRED ] && ALERT_TAG="red"
   fi
   [ ! -z  "$ALERT_TAG" -a "$COLOUR" != "red" ] && COLOUR=$ALERT_TAG
   [ ! -z "$ALERT_TAG" ] && ALERT_TAG="&$ALERT_TAG"
   echo "$ZONE $CAP    ${PCT_USED}% ${TOTMEMFREE} $RSS   ${PCT_USEDG}%  |  $SWAPCAP     ${PCTSWAP}%  $SWAP $ALERT_TAG" >> $TEMPFILE.mem
done
sort $TEMPFILE.mem >> $TEMPFILE
$XYMON $XYMSRV "status $MACHINE.zmem $COLOUR $(cat $TEMPFILE)"
$XYMON $XYMSRV "data $MACHINE.zmem $(echo; cat $TEMPFILE.mem | grep -v "Global " | awk '{ print $1":"$3 }' | sed 's/\%$//g'; echo; echo "Ignore this")"

rm $TEMPFILE
rm $TEMPFILE.mem
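
The thresholds and switches near the top of the script can be overridden per host. The script looks for lines starting with ZONEMEM: in the client's logfetch configuration file, which the Xymon client receives from the host's section of client-local.cfg on the server, so a line such as ZONEMEM:MEMRED=98 there should change the red memory threshold. The sketch below shows the idea in isolation. The sample file contents are invented, and the guard that only accepts lines shaped like VAR=value is an addition for this sketch, not something the original script does.

```shell
#!/bin/sh
# Sketch of the override step: read "ZONEMEM:VAR=value" lines and apply them.
MEMRED=95
MEMYEL=80
ALERT_HIGH_SWAP=true

# Invented stand-in for $XYMONTMP/logfetch.<hostname>.cfg
CFG=$(mktemp)
cat > "$CFG" <<'EOF'
log:/var/adm/messages:10240
ZONEMEM:MEMRED=98
ZONEMEM:ALERT_HIGH_SWAP=false
EOF

# Redirect from a file instead of piping into the loop, so the assignments
# happen in the current shell in any POSIX shell, not only in ksh.
OVERRIDES=$(mktemp)
grep '^ZONEMEM:' "$CFG" | cut -d: -f2- > "$OVERRIDES"
while read NEW_DEF
do
   case $NEW_DEF in
      [A-Z_]*=*) eval "$NEW_DEF" ;;   # skip lines that do not look like VAR=value
   esac
done < "$OVERRIDES"
rm -f "$CFG" "$OVERRIDES"

echo "MEMRED=$MEMRED MEMYEL=$MEMYEL ALERT_HIGH_SWAP=$ALERT_HIGH_SWAP"
```

Because the values are passed to eval, anyone who can edit client-local.cfg can run commands as the Xymon client user, so treat that file as trusted configuration.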

This is pretty “rough and ready”.
It hasn't been code reviewed, or even looked at by anybody else.
I can safely say, there are no known bugs. LOL
But, we all know the rules. Any useful program will contain at least one variable, one loop and one bug.
I know where to find the variables and the loops.
I leave finding the bug(s) as an exercise for the reader.

If you find any, please let me know.

Find that last unknown bug.

All my own work.
Nobody else to blame.

  • 2013-08-02
    • Initial release

monitors/zonemem.txt · Last modified: 2013/08/02 05:50 by vernon