====== Xymon Client CPU Load Calculator ====== ^ Author | [[ howe.bill@gmail.com | Bill Howe ]] | ^ Compatibility | Xymon 4.3 | ^ Requirements | 'bc' package | ^ Download | None | ^ Last Update | 2016-11-25 | ===== Description ===== Calculate (and set) a Xymon client's cpu load thresholds, based off of the client's reported number of processors from the hostdata files. This is meant to run on the Xymon Server periodically with cron. It allows for far more useful load monitoring than arbitrarily setting a generic load or spending a lot of time editing config files for each system. ===== Installation ===== Installation instructions. === Client side === No modifications required. === Server side === - Install the 'bc' package (allows bash string to floating point number conversion) - Enterprise Linux 6/7yum install bc - Create the cpu-load-calc.sh script somewhere such as: /root/scripts/cpu-load-calc.sh - See source below for contents - Edit the "Customize Here" section - Edit multipliers if desired(number of procs * number for warning and critical CPU load thresholds) - Create the auto load directory on the Xymon server - Default: /etc/xymon/analysis.d/auto-cpuload.d - Add the auto load directory to the Xymon main analysis.cfg file so it is included - Default: "directory /etc/xymon/analysis.d/auto-cpuload.d" to /etc/xymon/analysis.cfg - Set the default load in /etc/xymon/analysis.cfg to a super low warning level of "0.1" and a critical of something very high, such as "100.0". - That way, a hostdata file is generated for a system that has not had its load auto calculated. (and you don't get alert emails/texts as its only at warning) - Setup the script to auto run via cron (-v is verbose output) - Example/etc/cron.hourly/xymon-calc-cpuload.sh #!/bin/bash # Description: Execute xymon script to auto calculate cpu load from hostdata /root/scripts/cpu-load-calc.sh -v &> /root/scripts/cpu-load-calc.log ===== Source ===== Expand source code below. Create /root/scripts/cpu-load-calc.sh: ==== cpu-load-calc.sh ==== #!/bin/bash # Title: cpu-load-calc.sh # Description: Calculate a xymon client's cpu load (Run on Xymon Server periodically with cron) # Dependency: Requires 'bc' package # Last Change: 2018-05-22 # Recent Changes:-Updated awk search to look for [nproc] at the beginning of the line #======================= # Customize Here #======================= # Warning and Critical Load Multipliers (num of procs * multiplier) load_warn_multiplier=1.0 load_crit_multiplier=1.5 # Directory to save auto load thresholds auto_load_dir="/etc/xymon/analysis.d/auto-cpuload.d" # Xymon server's hostdata directory xymon_hostdata_dir="/var/lib/xymon/hostdata" # Xymon server's main analysis config file xymon_analysis_cfg="/etc/xymon/analysis.cfg" #======================= # End of Customize #======================= #======================= # Pre-Run Error Checking #======================= ## Dependency Check ## which bc &> /dev/null if [[ $? -eq 1 ]]; then echo ">> Error! Dependent package 'bc' (byte code) not detected. Exiting..." exit 1 fi ## Does the Auto Load Directory exist? if [[ ! -d ${auto_load_dir} ]]; then echo ">> Error! The directory (${auto_load_dir}) does not exist or is not a directory. Exiting..." exit fi ## Write Access Check touch ${auto_load_dir}/testfile &> /dev/null if [[ $? -eq 1 ]]; then echo ">> Error! User '$(whoami)' does not have write access to ${auto_load_dir}! Exiting..." exit 1 else rm -f ${auto_load_dir}/testfile &> /dev/null fi ## Check if the auto_load_dir is included in main analysis config file grep "^directory ${auto_load_dir}" ${xymon_analysis_cfg} &> /dev/null if [[ $? -eq 1 ]]; then echo -e ">> Warning! Auto load directory (${auto_load_dir}) is not included in ${xymon_analysis_cfg}. Continuing, but auto CPU load settings will not take affect until 'directory ${auto_load_dir}' is added to ${xymon_analysis_cfg}.\n" fi #======================= # End of Pre-Run Error Checking #======================= #=============================== # Functions; Main starts after #=============================== function show_usage { echo -e "\n####==== Xymon Client Auto Load Thresholds ====####" echo -e "\nDescripton: Calculate a xymon client's cpu load." echo -e "\n--Usage" echo -e "$0 => No arguments, configure with no verbosity." echo -e "$0 -v => Verbose output." echo -e "$0 -r => Refresh CPU load data (force hostdata update)." echo -e "$0 -h => Display usage." } # Force snapshots of hostdata function force_hostdata { # Use node name passed as argument node_name=${1} # Lie to Xymon that the node's cpu is green, then yellow, forcing a hostdata snapshot xymon 127.0.0.1 "status ${node_name}.cpu green $(date)" xymon 127.0.0.1 "status ${node_name}.cpu yellow $(date)" } #======================= # Get Script Arguments #======================= # Reset POSIX variable in case it has been used previously in this shell OPTIND=1 # By default, no verbose output verbose_output="no" refresh_cpus="no" while getopts "hrv" opt; do case "${opt}" in h) # -h (help) argument show_usage exit 0 ;; r) # -r (refersh cpus) argument refresh_cpus="yes" ;; v) # -v (verbose) argument verbose_output="yes" ;; *) # invalid argument show_usage exit 0 ;; esac done #======================= # Main Program #======================= echo -e "== Xymon Client Auto Load Thresholds ==" echo -e "Load Warning Multiplier: ${load_warn_multiplier}" echo -e "Load Critical Multiplier: ${load_crit_multiplier}" echo -e "Saving configs to: ${auto_load_dir}" # For each node reporting host data for node in $(ls ${xymon_hostdata_dir}); do if [[ ${verbose_output} == "yes" ]]; then echo -e "\n>> Working on node: ${node}" fi if [[ ${refresh_cpus} == "yes" ]]; then if [[ ${verbose_output} == "yes" ]]; then echo -e "\n-> Refreshing hostdata..." fi # Force an update of hostdata force_hostdata ${node} fi # Get the number of procs reported from node's most recent host data file node_num_procs="$(cat ${xymon_hostdata_dir}/${node}/$(ls -tr ${xymon_hostdata_dir}/${node}/ | tail -1) | awk '/^\[nproc]/ { getline; print }')" # If node_num_procs is empty or not a number, move to the next node if [[ -z ${node_num_procs} || ! ${node_num_procs} =~ [0-9][0-9]* ]]; then # Did not find 'nproc' in the host data file or no number from nproc returned if [[ ${verbose_output} == "yes" ]]; then echo "-> Warning! Could not find 'nproc' in ${node}'s host data file or no number returned. Skipping..." fi continue fi # Calculate the warning and critical load thresholds (normalize as a floating point with bc) load_warning=$(echo "${node_num_procs} * ${load_warn_multiplier}" | bc) load_critical=$(echo "${node_num_procs} * ${load_crit_multiplier}" | bc) if [[ ${verbose_output} == "yes" ]]; then echo -e "-> Number of Procs: ${node_num_procs}" echo -e "-> Warning at: ${load_warning}" echo -e "-> Critical at: ${load_critical}" echo -e "-> Creating node analysis drop in file..." fi # Create analysis drop in file echo "# ${node}'s CPU Load Thresholds (Warning Critical)" > ${auto_load_dir}/${node}.cfg echo "HOST=${node}" >> ${auto_load_dir}/${node}.cfg echo " LOAD ${load_warning} ${load_critical}" >> ${auto_load_dir}/${node}.cfg done echo -e "\n== Auto Load Thresholds Complete ==" exit 0 ===== Known Bugs and Issues ===== * Enterprise Linux 5 does not report number of procs in hostdata. Those systems will need to be set manually in analysis.cfg files. ===== To Do ===== ===== Credits ===== ===== Changelog ===== * **2018-05-22** * Updated awk search to look for [nproc] at the beginning of the line. * **2016-12-28** * Added green status in force_hostdata function to improve refresh logic. * **2016-11-25** * Initial release - Moved all information from Xymon mailing lists to here.