Author | Vernon Everett |
Compatibility | Xymon 4.2 |
Requirements | It monitors zones, so I guess Solaris is pretty important. |
Download | None |
Last Update | 2013-08-02 |
I wanted a single unified view of zone memory utilisation, to assist with resource allocation improvements.
Then I added the swap utilisation, and then added graphing too, just for the hell of it.
Under certain circumstances the graphing is useful; under others, not so much.
Then I added alerting for when thresholds were breached.
NOTE: The script only runs in the global zone.
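As a rough illustration of the alerting, the sketch below maps a usage percentage to a Xymon colour. The function name usage_colour is hypothetical and not part of the script; the thresholds in the two sample calls are the script's defaults (80/95 for memory, 75/90 for swap). The comparison is done in awk so that fractional percentages work in any POSIX shell.

```shell
# Map a usage percentage to a Xymon colour.
# $1 = percent used, $2 = yellow threshold, $3 = red threshold
# A value must be strictly above a threshold to trigger it.
usage_colour() {
    awk -v p="$1" -v y="$2" -v r="$3" 'BEGIN {
        if (p + 0 > r + 0)      print "red"
        else if (p + 0 > y + 0) print "yellow"
        else                    print "green"
    }'
}

usage_colour 85.5 80 95    # memory at 85.5% with the default 80/95 thresholds: yellow
usage_colour 91.0 75 90    # swap at 91% with the default 75/90 thresholds: red
```

In the script itself the worst colour seen across all zones, for either memory or swap, becomes the colour of the whole status.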
Client side
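Copy the script to $XYMONCLIENTHOME/ext in the normal way.
Make sure it's executable.
With ZFSFREE=true (the default), the script also calls $XYMONCLIENTHOME/ext/getmemstat.ksh, so that helper must be present too.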
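A minimal sketch of the client-side install follows. The file name zonemem.ksh is an assumption (the page does not name the script), and the clientlaunch.cfg section shown in the comments is a typical layout with an assumed 5-minute interval, not something taken from this page.

```shell
# Hypothetical file name; adjust to whatever you saved the script as.
SCRIPT_NAME=zonemem.ksh

# Copy the script into the ext directory and make it executable.
# $1 = path to the saved script, $2 = target ext directory
install_ext() {
    cp "$1" "$2/$SCRIPT_NAME" && chmod 755 "$2/$SCRIPT_NAME"
}

# Typical use on the client, as the xymon user:
#   install_ext ./zonemem.ksh "$XYMONCLIENTHOME/ext"
#
# xymonlaunch then needs a section in clientlaunch.cfg, along these lines:
#   [zmem]
#       ENVFILE $XYMONCLIENTHOME/etc/xymonclient.cfg
#       CMD $XYMONCLIENTHOME/ext/zonemem.ksh
#       LOGFILE $XYMONCLIENTLOGS/zonemem.log
#       INTERVAL 5m
```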
Server side
If you want graphing, add the following to xymonserver.cfg:
SPLITNCV_zmem="*:GAUGE"
Append the following to the TEST2RRD= setting:
zmem=ncv
and the following to the GRAPHS= setting:
zmem::4
Both settings also live in xymonserver.cfg.
Then add the following to graphs.cfg:
[zmem]
FNPATTERN ^zmem,(.*).rrd
TITLE Zone %Memory Utilisation
YAXIS %
-l 0
-u 100
DEF:p@RRDIDX@=@RRDFN@:lambda:AVERAGE
LINE2:p@RRDIDX@#@COLOR@:@RRDPARAM@
GPRINT:p@RRDIDX@:LAST: \: %5.1lf (cur)
GPRINT:p@RRDIDX@:MAX: \: %5.1lf (max)
GPRINT:p@RRDIDX@:MIN: \: %5.1lf (min)
GPRINT:p@RRDIDX@:AVERAGE: \: %5.1lf (avg)\n
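For orientation, the graph definition above relies on the client sending a data message with one name:value line per zone. With SPLITNCV, Xymon stores each name in its own RRD file (hence the zmem,(.*).rrd pattern) with a single dataset called lambda. The sketch below shows that conversion on one sample report line. The function name report_to_ncv and the zone web01 with its values are made up for illustration; the column layout follows the script's report.

```shell
# Turn report lines into "zone:percent" pairs for the data message.
# Column 1 is the zone name, column 3 the memory-used percentage.
report_to_ncv() {
    awk '{ sub(/%$/, "", $3); print $1 ":" $3 }'
}

# Sample report line (made-up values) piped through the converter.
printf '%s\n' 'web01 4G 12.34% 3500M 500M 1.52% | n/a 3.10% 800M' | report_to_ncv
# prints web01:12.34
```

Only the memory percentage is graphed this way; the swap columns appear in the status page but are not sent as data.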
Source
#!/usr/bin/ksh
if [ "$(/usr/bin/zonename)" != "global" ]
then
echo "I only run in global zones."
exit 1
fi
SWAPRED=90
SWAPYEL=75
MEMRED=95
MEMYEL=80
TEMPFILE=$XYMONTMP/zonemem.tmp
COLOUR=green
ALERT_HIGH_SWAP=true
ALERT_HIGH_MEM=true
typeset -L10 ZONE
typeset -R5 MEMCAP SWAPCAP CAP
typeset -R6 PCT_USED PCTSWAP
typeset -R8 MEMUSED MEMFREE TOTMEMFREE GLOBUSED RSS PCT_USEDG SWAP
# If we count the ZFS cache as free mem, then set this to true.
# If false, we will use the vmstat definition of free memory.
ZFSFREE=true
ALERT=1 # 1= alert on high usage
YELLOW=80
RED=90
INCLUDE_GLOBAL=true
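# Now we redefine some variables, if they are set in clientlocal.
# Each ZONEMEM: line is expected to carry one assignment, e.g. ZONEMEM:MEMRED=90
LOGFETCH=${XYMONTMP}/logfetch.$(uname -n).cfg
if [ -f "$LOGFETCH" ]
then
grep "^ZONEMEM:" "$LOGFETCH" | cut -d":" -f2 \
| while read NEW_DEF
do
# eval makes sure a plain NAME=value is treated as an assignment.
# ksh runs the last pipeline stage in the current shell, so the values persist.
eval "$NEW_DEF"
done
fi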
size_to_K ()
{
typeset -R1 UNIT
SIZE=$1
UNIT=$SIZE
case $UNIT in
K)
MULTIPLYER=1;;
M)
MULTIPLYER=1024;;
G)
MULTIPLYER=1048576;;
T)
MULTIPLYER=1073741824;;
esac
SIZE=${SIZE%?}
((SIZE=SIZE*MULTIPLYER))
echo $SIZE
}
size_K_to_G ()
{
# Or to M if we add an M parameter.
if [ "$2" = "M" ]
then
GSize=$(echo "$1/1024" | bc)M
else
GSize=$(echo "scale=3; $1/1048576+0.005" | bc)
GSize=${GSize%?}G
fi
# GSize already carries its unit suffix.
echo $GSize
}
INSTALLED_MEM=$(prtdiag -v | grep Memory)
GLOBMEM=$(echo $INSTALLED_MEM | awk '{ print $3"M" }')
GLOBMEM=$(size_to_K $GLOBMEM)
date > $TEMPFILE
echo >> $TEMPFILE
echo "Global $INSTALLED_MEM" | sed "s/size:/size - /g" >> $TEMPFILE
NUMZONES=$(zoneadm list | wc -l)
NUMZONES=$(echo $NUMZONES | sed 's/^[ \t]*//')
if [ "$ZFSFREE" = "true" ]
then
MEMFREE_GLOB=$( $XYMONCLIENTHOME/ext/getmemstat.ksh | egrep "^ZFS File Data|^Free" | cut -c18- | awk '{ sum += $2 } END { print sum }')
MEMFREE_GLOBK=$(size_to_K ${MEMFREE_GLOB}M)
else
MEMFREE_GLOBK=$(vmstat 2 2 | tail -1 | awk '{ print $5 }')
fi
echo " M E M O R Y | S W A P " >> $TEMPFILE
echo "Zone MemCap MemUsed% AvailMem UsedMem %of Global | SwapCap SwapUsed% SwapUsed " >> $TEMPFILE
prstat -n 1,10 -Z 1 2 | grep -v "^Total" | tail -${NUMZONES} | grep -v "global" | \
while read ID PROC SWAP RSS MEM TIME CPU ZONE
do
ALERT_TAG=""
MEMCAP=$(zonecfg -z $ZONE info capped-memory | grep physical | cut -d: -f2)
SWAPCAP=$(zonecfg -z $ZONE info capped-memory | grep swap | cut -d: -f2 | sed 's/]//g')
MEMUSED=$(size_to_K $RSS)
# Add .005 to simulate rounding, because the last digit is truncated below.
PCT_USEDG=$(echo "scale =3; ((${MEMUSED}*100/${GLOBMEM})+0.005)" | bc)
PCT_USEDG=${PCT_USEDG%?}
if [ -z $MEMCAP ]
then
# No cap, so we look at the total free memory.
MEMFREE=$MEMFREE_GLOBK
MEMFREEL=$MEMFREE_GLOBK
PCT_USED=$PCT_USEDG
CAP="n/a"
else
# With a cap we need to compare total used to the cap
MEML=$(size_to_K $MEMCAP)
MEMFREEL=$(echo ${MEML}-${MEMUSED} | bc)
MEMFREE=$MEMFREEL
PCT_USED=$(echo "scale=3; (${MEMUSED}*100/${MEML})+0.005" | bc)
# The .005 added above simulates rounding; the last digit is truncated here.
PCT_USED=${PCT_USED%?}
CAP="$MEMCAP"
fi
TOTMEMFREE=$(size_K_to_G $MEMFREE M)
GSWAP=$(swap -s | awk '{ print $2 }' | sed "s/k$//g")
if [ -z $SWAPCAP ]
then
SWAPCAP="n/a"
TOTSWAPK=$GSWAP
# No cap, we look at the total free.
else
TOTSWAPK=$(size_to_K $SWAPCAP)
# With a cap we need to compare total used to the cap
fi
SWAPK=$(size_to_K $SWAP)
PCTSWAP=$(echo "scale=3; (${SWAPK}*100/${TOTSWAPK})+0.005" | bc)
PCTSWAP=${PCTSWAP%?}
if [ "$ALERT_HIGH_SWAP" = "true" ]
then
[ $PCTSWAP -gt $SWAPYEL ] && ALERT_TAG="yellow"
[ $PCTSWAP -gt $SWAPRED ] && ALERT_TAG="red"
fi
if [ "$ALERT_HIGH_MEM" = "true" ]
then
[ $PCT_USED -gt $MEMYEL -a "$ALERT_TAG" != "red" ] && ALERT_TAG="yellow"
[ $PCT_USED -gt $MEMRED ] && ALERT_TAG="red"
fi
[ ! -z "$ALERT_TAG" -a "$COLOUR" != "red" ] && COLOUR=$ALERT_TAG
[ ! -z "$ALERT_TAG" ] && ALERT_TAG="&$ALERT_TAG"
echo "$ZONE $CAP ${PCT_USED}% ${TOTMEMFREE} $RSS ${PCT_USEDG}% | $SWAPCAP ${PCTSWAP}% $SWAP $ALERT_TAG" >> $TEMPFILE.mem
done
sort $TEMPFILE.mem >> $TEMPFILE
$XYMON $XYMSRV "status $MACHINE.zmem $COLOUR $(cat $TEMPFILE)"
$XYMON $XYMSRV "data $MACHINE.zmem $(echo; cat $TEMPFILE.mem | grep -v "Global " | awk '{ print $1":"$3 }' | sed 's/\%$//g'; echo; echo "Ignore this")"
rm $TEMPFILE
rm $TEMPFILE.mem
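The script picks up threshold overrides from ZONEMEM: lines that reach the client through client-local.cfg. The sketch below shows one way to apply them, assuming entries take the form ZONEMEM:NAME=value. The function name apply_overrides and the sample file contents are hypothetical. Unlike the script, it reads from a here-document instead of a pipe, so the assignments also survive in shells that run pipeline stages in a subshell, and it skips lines that do not look like a simple assignment.

```shell
# Apply ZONEMEM: overrides from a logfetch-style config file.
# $1 = path to the file (the script uses $XYMONTMP/logfetch.<host>.cfg)
apply_overrides() {
    [ -f "$1" ] || return 0
    # Keep only the ZONEMEM: lines and strip the prefix.
    overrides=$(sed -n 's/^ZONEMEM://p' "$1")
    while IFS= read -r line
    do
        # Skip anything that is not NAME=value.
        case $line in *=*) ;; *) continue ;; esac
        name=${line%%=*}
        value=${line#*=}
        # Names: capitals and underscores only. Values: letters and digits only.
        case $name in ''|*[!A-Z_]*) continue ;; esac
        case $value in ''|*[!A-Za-z0-9]*) continue ;; esac
        eval "$name=\$value"
    done <<EOF
$overrides
EOF
}

# Defaults first, then overrides from a sample file (made-up content).
# On the server, the matching client-local.cfg section might look like:
#   [myhost]
#   ZONEMEM:MEMRED=90
#   ZONEMEM:ALERT_HIGH_SWAP=false
MEMRED=95
ALERT_HIGH_SWAP=true
sample=$(mktemp)
printf '%s\n' 'ZONEMEM:MEMRED=90' 'ZONEMEM:ALERT_HIGH_SWAP=false' > "$sample"
apply_overrides "$sample"
rm -f "$sample"
echo "MEMRED=$MEMRED ALERT_HIGH_SWAP=$ALERT_HIGH_SWAP"
```

The character checks are a light sanity filter rather than a security boundary; the file is written by the Xymon client from server-side configuration.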
Known Bugs and Issues
This is pretty “rough and ready”.
It hasn't been code reviewed, or even looked at by anybody else.
I can safely say, there are no known bugs.
But, we all know the rules. Any useful program will contain at least one variable, one loop and one bug.
I know where to find the variables and the loops.
I leave finding the bug(s) as an exercise for the reader.
If you find any, please let me know.
To Do
Find that last unknown bug.
Credits
All my own work.
Nobody else to blame.