
zonemem.ksh

Author Vernon Everett
Compatibility Xymon 4.2
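Requirements It monitors zones, so I guess Solaris is pretty important. With ZFSFREE=true (the default), the script also calls $XYMONCLIENTHOME/ext/getmemstat.ksh, so that script needs to be installed too.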
Download None
Last Update 2013-08-02

I wanted a single, unified view of zone memory utilisation to help improve resource allocation.
Then I added swap utilisation, and then graphing too, just for the hell of it.
The graphing is useful in some circumstances, and not so much in others.
Then I added alerting for when thresholds are breached.

NOTE: The script only runs in the global zone.

Client side

Copy script to $XYMONCLIENTHOME/ext in the normal way. Make sure it's executable.
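
As a rough sketch of what "the normal way" usually involves: copy the script, make it executable, and add a section to clientlaunch.cfg so the client runs it. The install_zonemem helper, the [zonemem] section name, the 5m interval and the log file name are all assumptions for illustration, so adjust them to your site.

```shell
#!/bin/sh
# Sketch only: install zonemem.ksh into a Xymon client tree and register it.
# install_zonemem is a hypothetical helper; section name, interval and
# log file name are assumptions, not something the script requires.
install_zonemem () {
    src=$1     # path to your copy of zonemem.ksh
    home=$2    # the client's $XYMONCLIENTHOME
    mkdir -p "$home/ext" "$home/etc" || return 1
    cp "$src" "$home/ext/zonemem.ksh" || return 1
    chmod 755 "$home/ext/zonemem.ksh" || return 1
    # Register the task once; skip if a [zonemem] section already exists.
    if ! grep '^\[zonemem\]' "$home/etc/clientlaunch.cfg" >/dev/null 2>&1
    then
        cat >> "$home/etc/clientlaunch.cfg" <<'EOF'

[zonemem]
        ENVFILE $XYMONCLIENTHOME/etc/xymonclient.cfg
        CMD $XYMONCLIENTHOME/ext/zonemem.ksh
        LOGFILE $XYMONCLIENTHOME/logs/zonemem.log
        INTERVAL 5m
EOF
    fi
}

# Usage: install_zonemem ./zonemem.ksh "$XYMONCLIENTHOME"
```

The client picks up the new section the next time it rereads clientlaunch.cfg.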

Server side

If you want graphing, add the following to xymonserver.cfg

SPLITNCV_zmem="*:GAUGE"

And add the following to TEST2RRD=

zmem=ncv

and GRAPHS=

zmem::4

also in xymonserver.cfg
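
For context, the ncv handler works on name:value lines in the data message, and SPLITNCV stores each name in its own RRD file. The script builds those lines from the first and third columns of its report (zone name and MemUsed%). The sketch below mirrors that pipeline. The rows_to_ncv name and the sample rows are made up for illustration.

```shell
#!/bin/sh
# Sketch: turn report rows ("zone cap used% ...") into NCV "zone:value" lines,
# the same awk + sed combination the script uses for its data message.
rows_to_ncv () {
    awk '{ print $1":"$3 }' | sed 's/%$//'
}

# Sample rows, invented for illustration only.
printf '%s\n' \
    'web01       2G   41.26%   1203M   845M    2.58%  |   4G   10.02%   410M' \
    'db01       n/a   12.50%   9000M  4096M   12.50%  |  n/a    3.10%  1200M' \
  | rows_to_ncv
```

Each resulting name ends up as a separate zmem,<zone>.rrd file, which is what a graph definition keyed on that file name pattern would pick up.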

And add the following to graphs.cfg

[zmem]
    FNPATTERN ^zmem,(.*).rrd
    TITLE Zone %Memory Utilisation
    YAXIS %
    -l 0
    -u 100
    DEF:p@RRDIDX@=@RRDFN@:lambda:AVERAGE
    LINE2:p@RRDIDX@#@COLOR@:@RRDPARAM@
    GPRINT:p@RRDIDX@:LAST: \: %5.1lf (cur)
    GPRINT:p@RRDIDX@:MAX: \: %5.1lf (max)
    GPRINT:p@RRDIDX@:MIN: \: %5.1lf (min)
    GPRINT:p@RRDIDX@:AVERAGE: \: %5.1lf (avg)\n
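
The script can also take threshold overrides from the server: it looks for lines starting with ZONEMEM: in the logfetch config file the client receives, and applies what follows the colon as a variable assignment. My reading is that such lines would go in the host's section of client-local.cfg, for example ZONEMEM:MEMYEL=70, but treat that as an assumption and test it on your setup. Below is a minimal sketch of the idea in portable shell, with a sanity check before eval. The apply_overrides name and the demo file are hypothetical.

```shell
#!/bin/sh
# Sketch: apply "ZONEMEM:NAME=value" override lines from a config file.
# Defaults, as in the script:
MEMYEL=80
MEMRED=95
ALERT_HIGH_SWAP=true

apply_overrides () {
    cfg=$1
    [ -f "$cfg" ] || return 0
    # Read via redirection, not a pipe, so assignments reach this shell.
    while IFS= read -r line
    do
        case $line in
            ZONEMEM:*) def=${line#ZONEMEM:} ;;
            *) continue ;;
        esac
        # Accept only plain NAME=value, to keep eval from running commands.
        expr "X$def" : 'X[A-Z_][A-Z0-9_]*=[A-Za-z0-9]*$' >/dev/null || continue
        eval "$def"
    done < "$cfg"
}

# Demo with a made-up override file.
demo=${TMPDIR:-/tmp}/zonemem.demo.$$
printf '%s\n' 'ZONEMEM:MEMYEL=70' 'ZONEMEM:ALERT_HIGH_SWAP=false' \
    'log:/var/adm/messages:10240' > "$demo"
apply_overrides "$demo"
rm -f "$demo"
echo "MEMYEL=$MEMYEL MEMRED=$MEMRED ALERT_HIGH_SWAP=$ALERT_HIGH_SWAP"
```

Lines that do not start with ZONEMEM:, or that contain anything other than a simple assignment, are ignored.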
    


#!/usr/bin/ksh
if [ "$(/usr/bin/zonename)" != "global" ]
then
   echo "I only run in global zones."
   exit 1
fi
SWAPRED=90
SWAPYEL=75
MEMRED=95
MEMYEL=80
TEMPFILE=$XYMONTMP/zonemem.tmp
COLOUR=green

ALERT_HIGH_SWAP=true
ALERT_HIGH_MEM=true

typeset -L10 ZONE
typeset -R5 MEMCAP SWAPCAP CAP
typeset -R6 PCT_USED PCTSWAP
typeset -R8 MEMUSED MEMFREE TOTMEMFREE GLOBUSED RSS PCT_USEDG SWAP
# If we count the ZFS cache as free mem, then set this to true.
# If false, we will use the vmstat definition of free memory.
ZFSFREE=true
ALERT=1  # 1= alert on high usage
YELLOW=80
RED=90
INCLUDE_GLOBAL=true

# Now we redefine some variables, if they are set in clientlocal
LOGFETCH=${XYMONTMP}/logfetch.$(uname -n).cfg
if [ -f $LOGFETCH ]
then
   grep "^ZONEMEM:" $LOGFETCH | cut -d":" -f2 \
      | while read NEW_DEF
        do
           # eval is needed: a bare $NEW_DEF is run as a command, not as an assignment.
           eval "$NEW_DEF"
        done
fi


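size_to_K ()
{
    # Convert a size with a K, M, G or T suffix (e.g. 512M) to kilobytes.
    # typeset -R1 keeps only the rightmost character, i.e. the unit suffix.
    typeset -R1 UNIT
    SIZE=$1
    UNIT=$SIZE
    case $UNIT in
       K)
          MULTIPLYER=1;;
       M)
          MULTIPLYER=1024;;
       G)
          MULTIPLYER=1048576;;
       T)
          MULTIPLYER=1073741824;;
       *)
          # Unknown suffix: assume K so the arithmetic below does not fail
          # on an unset multiplier.
          MULTIPLYER=1;;
    esac
    SIZE=${SIZE%?}
    ((SIZE=SIZE*MULTIPLYER))
    echo $SIZE
}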

size_K_to_G ()
{
   # Convert kilobytes to gigabytes, or to megabytes if the second argument is M.
   if [ "$2" = "M" ]
   then
      GSize=$(echo "$1/1024" | bc)M
   else
      GSize=$(echo "scale=3; $1/1048576+0.005" | bc)
      GSize=${GSize%?}G
   fi
   # GSize already carries its unit suffix.
   echo $GSize
}

INSTALLED_MEM=$(prtdiag -v | grep Memory)
GLOBMEM=$(echo $INSTALLED_MEM | awk '{ print $3"M" }')
GLOBMEM=$(size_to_K $GLOBMEM)

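# The per-zone loop appends to $TEMPFILE.mem, so clear any leftover from an interrupted run.
rm -f $TEMPFILE.mem
date > $TEMPFILE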
echo >> $TEMPFILE
echo "Global $INSTALLED_MEM" | sed "s/size:/size - /g" >> $TEMPFILE

NUMZONES=$(zoneadm list | wc -l)
NUMZONES=$(echo $NUMZONES | sed 's/^[ \t]*//')
if [ "$ZFSFREE" = "true" ]
then
    MEMFREE_GLOB=$( $XYMONCLIENTHOME/ext/getmemstat.ksh | egrep "^ZFS File Data|^Free" | cut -c18-  | awk '{ sum += $2 } END { print sum }')
    MEMFREE_GLOBK=$(size_to_K ${MEMFREE_GLOB}M)
else
    MEMFREE_GLOBK=$(vmstat 2 2 | tail -1 | awk '{ print $5 }')
fi


echo "                     M E M O R Y                           |          S W A P " >> $TEMPFILE
echo "Zone      MemCap   MemUsed% AvailMem  UsedMem  %of Global  | SwapCap  SwapUsed%  SwapUsed " >> $TEMPFILE
prstat -n 1,10 -Z 1 2 | grep -v "^Total" | tail -${NUMZONES} | grep -v "global" | \
while read ID PROC SWAP RSS MEM TIME CPU ZONE
do
   ALERT_TAG=""
   MEMCAP=$(zonecfg -z $ZONE info capped-memory | grep physical | cut -d: -f2)
   SWAPCAP=$(zonecfg -z $ZONE info capped-memory | grep swap | cut -d: -f2 | sed 's/]//g')
   MEMUSED=$(size_to_K $RSS)
   # Add .005 to simulate rounding, because the last digit is truncated below.
   PCT_USEDG=$(echo "scale=3; ((${MEMUSED}*100/${GLOBMEM})+0.005)" | bc)
   PCT_USEDG=${PCT_USEDG%?}
   if [ -z $MEMCAP ]
   then
      # No cap, we look at the total free.
      MEMFREE=$MEMFREE_GLOBK
      PCT_USED=$PCT_USEDG
      CAP="n/a"
      MEMFREEL=$MEMFREE_GLOBK
   else
      # With a cap we need to compare total used to the cap
      MEML=$(size_to_K $MEMCAP)
      MEMFREEL=$(echo ${MEML}-${MEMUSED} | bc)
      MEMFREE=$MEMFREEL
      PCT_USED=$(echo "scale=3; (${MEMUSED}*100/${MEML})+0.005" | bc)
      # Add .005 to simulate rounding, because the last digit is truncated below.
      PCT_USED=${PCT_USED%?}
      CAP="$MEMCAP"
   fi
   TOTMEMFREE=$(size_K_to_G $MEMFREE M)

   GSWAP=$(swap -s | awk '{ print $2 }' | sed "s/k$//g")
   if [ -z $SWAPCAP ]
   then
      SWAPCAP="n/a"
      TOTSWAPK=$GSWAP
      # No cap, we look at the total free.
   else
      TOTSWAPK=$(size_to_K $SWAPCAP)
      # With a cap we need to compare total used to the cap
   fi
   SWAPK=$(size_to_K $SWAP)
   PCTSWAP=$(echo "scale=3; (${SWAPK}*100/${TOTSWAPK})+0.005" | bc)
   PCTSWAP=${PCTSWAP%?}
   if [ "$ALERT_HIGH_SWAP" = "true" ]
   then
      [ $PCTSWAP -gt $SWAPYEL ] && ALERT_TAG="yellow"
      [ $PCTSWAP -gt $SWAPRED ] && ALERT_TAG="red"
   fi
   if [ "$ALERT_HIGH_MEM" = "true" ]
   then
      [ $PCT_USED -gt $MEMYEL -a "$ALERT_TAG" != "red" ] && ALERT_TAG="yellow"
      [ $PCT_USED -gt $MEMRED ] && ALERT_TAG="red"
   fi
   [ ! -z  "$ALERT_TAG" -a "$COLOUR" != "red" ] && COLOUR=$ALERT_TAG
   [ ! -z "$ALERT_TAG" ] && ALERT_TAG="&$ALERT_TAG"
   echo "$ZONE $CAP    ${PCT_USED}% ${TOTMEMFREE} $RSS   ${PCT_USEDG}%  |  $SWAPCAP     ${PCTSWAP}%  $SWAP $ALERT_TAG" >> $TEMPFILE.mem
done
sort $TEMPFILE.mem >> $TEMPFILE
$XYMON $XYMSRV "status $MACHINE.zmem $COLOUR $(cat $TEMPFILE)"
$XYMON $XYMSRV "data $MACHINE.zmem $(echo; cat $TEMPFILE.mem | grep -v "Global " | awk '{ print $1":"$3 }' | sed 's/\%$//g'; echo; echo "Ignore this")"

rm $TEMPFILE
rm $TEMPFILE.mem
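
The arithmetic at the heart of the script is a unit conversion to kilobytes followed by a percentage. Here is a minimal, portable sketch of both steps, assuming whole-number sizes such as 845M or 2G. It is not the script's own code: the lower-case function names are mine, and it uses awk's printf for the two-decimal rounding instead of the bc plus 0.005 trick.

```shell
#!/bin/sh
# Sketch: portable versions of the script's unit and percentage arithmetic.

# Convert a whole-number size such as 512K, 300M, 2G or 1T to kilobytes.
size_to_k () {
    num=${1%?}                     # strip the one-character unit suffix
    case $1 in
        *K) mult=1 ;;
        *M) mult=1024 ;;
        *G) mult=1048576 ;;
        *T) mult=1073741824 ;;
        *)  num=$1; mult=1 ;;      # assumption: a bare number is already in K
    esac
    echo $((num * mult))
}

# Percentage of used against total, rounded to two decimals.
# On Solaris, plain awk may not accept -v; nawk or /usr/xpg4/bin/awk should.
pct () {
    awk -v u="$1" -v t="$2" 'BEGIN { printf "%.2f\n", u * 100 / t }'
}

# Example: a zone using 845M against a 2G cap.
used=$(size_to_k 845M)
cap=$(size_to_k 2G)
echo "used=${used}K cap=${cap}K pct=$(pct "$used" "$cap")%"
```

Working in kilobytes throughout keeps the shell arithmetic in integers, and the only floating-point step is the final percentage.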

This is pretty “rough and ready”.
It hasn't been code reviewed, or even looked at by anybody else.
I can safely say, there are no known bugs. LOL
But, we all know the rules. Any useful program will contain at least one variable, one loop and one bug.
I know where to find the variables and the loops.
I leave finding the bug as an exercise for the reader.

If you find it, please let me know.

  • YYYY-MM-DD
    • Initial release
  • Last modified: 2013/08/02 05:47
  • by vernon