Differences
monitors:zonemem [2013/08/02 05:23] – created vernon | monitors:zonemem [2013/08/02 05:50] (current) – [Known Bugs and Issues] vernon | ||
---|---|---|---|
Line 8: | Line 8: | ||
===== Description ===== | ===== Description ===== | ||
+ | I wanted a single unified view of zone memory utilisation, to assist with resource allocation improvements.\\ | ||
+ | Then I added the swap utilisation, | ||
+ | Under certain circumstances, | ||
+ | Then I added alerting for when thresholds were breached.\\ | ||
+ | **NOTE** : Only runs in the global zone. | ||
===== Installation ===== | ===== Installation ===== | ||
=== Client side === | === Client side === | ||
+ | Copy script to $XYMONCLIENTHOME/ | ||
+ | Make sure it's executable. | ||
=== Server side === | === Server side === | ||
+ | If you want graphing, add the following to xymonserver.cfg | ||
+ | SPLITNCV_zmem=" | ||
+ | |||
+ | And add the following to TEST2RRD= | ||
+ | zmem=ncv | ||
+ | |||
+ | and GRAPHS= | ||
+ | zmem::4 | ||
+ | also in xymonserver.cfg\\ | ||
+ | \\ | ||
+ | |||
+ | Then add the following to graphs.cfg | ||
+ | |||
+ | [zmem] | ||
+ | FNPATTERN ^zmem, | ||
+ | TITLE Zone %Memory Utilisation | ||
+ | YAXIS % | ||
+ | -l 0 | ||
+ | -u 100 | ||
+ | DEF: | ||
+ | LINE2: | ||
+ | GPRINT: | ||
+ | GPRINT: | ||
+ | GPRINT: | ||
+ | GPRINT: | ||
+ | | ||
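With `zmem=ncv` in TEST2RRD, the server reads the data message as name/value lines, and SPLITNCV keeps one RRD file per name, which is what the `^zmem,` FNPATTERN is there to match. A minimal sketch of that shape follows, assuming one `name : value` line per zone. `emit_ncv` and the sample zone names are illustrative only, and the script's own formatting may differ.

```shell
#!/bin/sh
# Turn "zone percent" pairs on stdin into "name : value" lines.
# emit_ncv is a hypothetical helper, not part of the original script.
emit_ncv() {
    while read -r zone pct; do
        [ -n "$zone" ] || continue     # skip blank lines
        pct=${pct%\%}                  # drop a trailing % sign if present
        printf '%s : %s\n' "$zone" "$pct"
    done
}

# Sample input (made-up zones and values)
printf 'global 42%%\nweb01 17\n' | emit_ncv
```

Each output line then becomes one data set on the graph, so the names should be stable between runs.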
===== Source ===== | ===== Source ===== | ||
==== myscript.sh ==== | ==== myscript.sh ==== | ||
Line 19: | Line 52: | ||
<hidden onHidden=" | <hidden onHidden=" | ||
< | < | ||
+ | # | ||
+ | if [ " | ||
+ | then | ||
+ | echo "I only run in global zones." | ||
+ | exit 1 | ||
+ | fi | ||
+ | SWAPRED=90 | ||
+ | SWAPYEL=75 | ||
+ | MEMRED=95 | ||
+ | MEMYEL=80 | ||
+ | TEMPFILE=$XYMONTMP/ | ||
+ | COLOUR=green | ||
+ | |||
+ | ALERT_HIGH_SWAP=true | ||
+ | ALERT_HIGH_MEM=true | ||
+ | |||
+ | typeset -L10 ZONE | ||
+ | typeset -R5 MEMCAP SWAPCAP CAP | ||
+ | typeset -R6 PCT_USED PCTSWAP | ||
+ | typeset -R8 MEMUSED MEMFREE TOTMEMFREE GLOBUSED RSS PCT_USEDG SWAP | ||
+ | # If we count the ZFS cache as free mem, then set this to true. | ||
+ | # If false, we will use the vmstat definition of free memory. | ||
+ | ZFSFREE=true | ||
+ | ALERT=1 | ||
+ | YELLOW=80 | ||
+ | RED=90 | ||
+ | INCLUDE_GLOBAL=true | ||
+ | |||
+ | # Now we redefine some variables, if they are set in clientlocal | ||
+ | LOGFETCH=${XYMONTMP}/ | ||
+ | if [ -f $LOGFETCH ] | ||
+ | then | ||
+ | grep " | ||
+ | | while read NEW_DEF | ||
+ | do | ||
+ | | ||
+ | done | ||
+ | fi | ||
+ | |||
+ | |||
+ | size_to_K () | ||
+ | { | ||
+ | typeset -R1 UNIT | ||
+ | SIZE=$1 | ||
+ | UNIT=$SIZE | ||
+ | case $UNIT in | ||
+ | K) | ||
+ | MULTIPLYER=1;; | ||
+ | M) | ||
+ | MULTIPLYER=1024;; | ||
+ | G) | ||
+ | MULTIPLYER=1048576;; | ||
+ | T) | ||
+ | MULTIPLYER=1073741824;; | ||
+ | esac | ||
+ | SIZE=${SIZE%? | ||
+ | ((SIZE=SIZE*MULTIPLYER)) | ||
+ | echo $SIZE | ||
+ | } | ||
+ | |||
+ | size_K_to_G () | ||
+ | { | ||
+ | # Or to M if we add an M parameter. | ||
+ | if [ " | ||
+ | then | ||
+ | GSize=$(echo " | ||
+ | else | ||
+ | GSize=$(echo " | ||
+ | GSize=${GSize%? | ||
+ | fi | ||
+ | echo $GSize$UNIT | ||
+ | } | ||
+ | |||
+ | INSTALLED_MEM=$(prtdiag -v | grep Memory) | ||
+ | GLOBMEM=$(echo $INSTALLED_MEM | awk '{ print $3" | ||
+ | GLOBMEM=$(size_to_K $GLOBMEM) | ||
+ | |||
+ | date > $TEMPFILE | ||
+ | echo >> $TEMPFILE | ||
+ | echo " | ||
+ | |||
+ | NUMZONES=$(zoneadm list | wc -l) | ||
+ | NUMZONES=$(echo $NUMZONES | sed 's/^[ \t]*//' | ||
+ | if [ " | ||
+ | then | ||
+ | MEMFREE_GLOB=$( $XYMONCLIENTHOME/ | ||
+ | MEMFREE_GLOBK=$(size_to_K ${MEMFREE_GLOB}M) | ||
+ | else | ||
+ | MEMFREE_GLOBK=$(vmstat 2 2 | tail -1 | awk '{ print $5 }') | ||
+ | fi | ||
+ | |||
+ | |||
+ | echo " | ||
+ | echo " | ||
+ | prstat -n 1,10 -Z 1 2 | grep -v " | ||
+ | while read ID PROC SWAP RSS MEM TIME CPU ZONE | ||
+ | do | ||
+ | | ||
+ | | ||
+ | | ||
+ | | ||
+ | # Add the .005 to simulate rounding when we are going to be truncating. | ||
+ | | ||
+ | | ||
+ | if [ -z $MEMCAP ] | ||
+ | then | ||
+ | MEMFREE=$MEMFREE_GLOBK | ||
+ | PCT_USED=$PCT_USEDG | ||
+ | CAP=" | ||
+ | MEMFREEL=$MEMFREE_GLOBK | ||
+ | # No cap, we look at the total free. | ||
+ | else | ||
+ | # With a cap we need to compare total used to the cap | ||
+ | MEML=$(size_to_K $MEMCAP) | ||
+ | MEMFREEL=$(echo ${MEML}-${MEMUSED} | bc) | ||
+ | MEMFREE=$MEMFREEL | ||
+ | PCT_USED=$(echo " | ||
+ | # Add the .005 to simulate rounding when we are going to be truncating. | ||
+ | PCT_USED=${PCT_USED%? | ||
+ | CAP=" | ||
+ | fi | ||
+ | | ||
+ | |||
+ | | ||
+ | if [ -z $SWAPCAP ] | ||
+ | then | ||
+ | SWAPCAP=" | ||
+ | TOTSWAPK=$GSWAP | ||
+ | # No cap, we look at the total free. | ||
+ | else | ||
+ | TOTSWAPK=$(size_to_K $SWAPCAP) | ||
+ | # With a cap we need to compare total used to the cap | ||
+ | fi | ||
+ | | ||
+ | | ||
+ | | ||
+ | if [ " | ||
+ | then | ||
+ | [ $PCTSWAP -gt $SWAPYEL ] && ALERT_TAG=" | ||
+ | [ $PCTSWAP -gt $SWAPRED ] && ALERT_TAG=" | ||
+ | fi | ||
+ | if [ " | ||
+ | then | ||
+ | [ $PCT_USED -gt $MEMYEL -a " | ||
+ | [ $PCT_USED -gt $MEMRED ] && ALERT_TAG=" | ||
+ | fi | ||
+ | [ ! -z " | ||
+ | [ ! -z " | ||
+ | echo "$ZONE $CAP ${PCT_USED}% ${TOTMEMFREE} $RSS | ||
+ | done | ||
+ | sort $TEMPFILE.mem >> $TEMPFILE | ||
+ | $XYMON $XYMSRV " | ||
+ | $XYMON $XYMSRV "data $MACHINE.zmem $(echo; cat $TEMPFILE.mem | grep -v " | ||
+ | |||
+ | rm $TEMPFILE | ||
+ | rm $TEMPFILE.mem | ||
+ | |||
+ | |||
</ | </ | ||
</ | </ | ||
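The script leans on ksh `typeset -R1` to pick off the unit suffix, and on `bc` for the percentages. As a rough illustration of the same three steps (size to kilobytes, percentage used, threshold to colour), here is a portable sketch in plain POSIX sh. The function names `size_to_k`, `pct_used` and `colour_for` are hypothetical, whole-number sizes are assumed, and integer rounding is swapped in for the `bc` plus .005 approach, so treat it as a sketch rather than a drop-in replacement.

```shell
#!/bin/sh
# Hypothetical helpers; the original script uses ksh typeset and bc instead.

# Convert a size such as 512M or 2G to kilobytes (whole numbers only).
size_to_k() {
    size=$1
    unit=${size#"${size%?}"}    # last character, e.g. "G"
    number=${size%?}            # everything before the unit
    case $unit in
        K) mult=1 ;;
        M) mult=1024 ;;
        G) mult=1048576 ;;
        T) mult=1073741824 ;;
        *) echo "size_to_k: unknown unit in '$size'" >&2; return 1 ;;
    esac
    echo $((number * mult))
}

# Percentage of total that is used, rounded to the nearest integer.
pct_used() {
    used=$1
    total=$2
    [ "$total" -gt 0 ] || return 1
    echo $(( (used * 100 + total / 2) / total ))
}

# Map a percentage to a colour, using strict "greater than" comparisons
# in the same way as the -gt tests in the script.
colour_for() {
    pct=$1
    yellow=$2
    red=$3
    if [ "$pct" -gt "$red" ]; then
        echo red
    elif [ "$pct" -gt "$yellow" ]; then
        echo yellow
    else
        echo green
    fi
}

# Example: a zone capped at 2G that is using 1800M (made-up figures),
# checked against the script's default MEMYEL=80 and MEMRED=95.
cap_k=$(size_to_k 2G)
used_k=$(size_to_k 1800M)
pct=$(pct_used "$used_k" "$cap_k")
echo "${pct}% $(colour_for "$pct" 80 95)"
```

Because the comparisons are strict, a zone sitting exactly on a threshold stays in the lower colour.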
===== Known Bugs and Issues ===== | ===== Known Bugs and Issues ===== | ||
+ | This is pretty "rough and ready" | ||
+ | It hasn't been code reviewed, or even looked at by anybody else.\\ | ||
+ | I can safely say, there are no **known** bugs. LOL \\ | ||
+ | But, we all know the rules. Any useful program will contain at least one variable, one loop and one bug.\\ | ||
+ | I know where to find the variables and the loops. \\ | ||
+ | I leave finding the bug(s) as an exercise for the reader.\\ | ||
+ | If you find any, please let me know. | ||
===== To Do ===== | ===== To Do ===== | ||
+ | Find that last unknown bug. | ||
===== Credits ===== | ===== Credits ===== | ||
+ | All my own work.\\ | ||
+ | Nobody else to blame. | ||
===== Changelog ===== | ===== Changelog ===== | ||
- | * **YYYY-MM-DD** | + | * **2013-08-02** |
* Initial release | * Initial release | ||