Changes between Version 1 and Version 2 of OMDeventhandlers

Timestamp:
10/04/11 12:16:17 (8 years ago)
Author:
jonmills (IP: 152.54.8.104)
Comment:

--

  • OMDeventhandlers

    v1 v2  
     1== Overview == 
     2 
     3The challenge in monitoring an environment like a Eucalyptus cluster is that it is always changing.  Virtual machines are created and destroyed all the time.  When virtual machines are running, we want to monitor them.  When they no longer exist, we want to stop monitoring them.  And most of all, we don't want to constantly alter the configuration of our monitoring system by hand to add and remove these hosts and their affiliated checks.  This is where OMD shines, because we can combine the utility of Nagios eventhandlers with the ability of Check_MK to (re-)inventory hosts, rebuild Nagios object configuration, and reload Nagios.  The result is a dynamic system that always knows what to monitor, and what not to monitor. 
     4 
     5 
     6== Check_MK inventory Eventhandler == 
     7 
     8The first step is to set up an eventhandler that can respond to a situation in which the service check "Check_MK inventory" discovers a new service. 
     9 
    110{{{ 
     11extra_nagios_conf += r""" 
    212 
    3 extra_nagios_conf += r""" 
     13# Defines an eventhandler (a Nagios command) that will run when service_description "Check_MK inventory" discovers  
     14# new things on a host that it can monitor.  The purpose is to automatically reconfigure Check_MK 
     15# to monitor those newly-discovered services. 
    416define command { 
    517    command_name    cmk_reinventory 
     
    719} 
    820""" 
     21 
     22# Map the eventhandler command to the service definition in Nagios 
     23extra_service_conf["event_handler"] = [ 
     24        ( "cmk_reinventory", ALL_HOSTS, ["Check_MK inventory"]), 
     25] 
     26# Enable eventhandlers in Nagios for the service definition 
     27extra_service_conf["event_handler_enabled"] = [ 
     28        ( "1", ALL_HOSTS, ["Check_MK inventory"]), 
     29] 
     30}}} 
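
The `command_line` of `cmk_reinventory` is not shown in this diff. As a rough sketch only, a handler along the following lines would match the behavior described above: re-inventory the host and reload Nagios when the inventory check reports unmonitored services. The `reinventory_host` function, the `CMK` override variable, and the assumption that Nagios passes `$HOSTNAME$` and `$SERVICESTATE$` as the two arguments are all hypothetical, and the sketch assumes the inventory check reports new services as WARNING.

```shell
#!/bin/bash
# Hypothetical sketch of a cmk_reinventory handler; not the script used on this site.
# Assumes Nagios passes $HOSTNAME$ and $SERVICESTATE$ as the two arguments.

# Path to cmk; overridable so the function can be exercised without an OMD site.
CMK="${CMK:-${OMD_ROOT}/bin/cmk}"

reinventory_host() {
    local host="$1" state="$2"
    case "$state" in
        WARNING)
            # Add newly found services for this host, then rebuild the
            # Nagios configuration and reload Nagios.
            "$CMK" -I "$host" && "$CMK" -O
            ;;
        *)
            # Any other state: nothing to inventory.
            ;;
    esac
}

# Only act when called with both arguments, as Nagios would.
if [ "$#" -ge 2 ]; then
    reinventory_host "$1" "$2"
fi
```

Limiting the inventory to the one host named by Nagios keeps the handler cheap; the `cmk -O` step is what makes the new services visible in Nagios.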
     31 
     32 
     33== Host Check Eventhandler == 
     34 
     35 * In this setup, the Nagios host check is a ping check, and the result is UP or DOWN depending on whether the host could be reached.
     36 * We want to define an eventhandler that is triggered by the DOWN state of a host, but only for hosts with the Check_MK tag 'vm'.
     37 * If a host has the 'vm' tag, is in a DOWN state, and is no longer listed as 'running' or 'pending' by `euca-describe-instances`, then we want to remove it from Check_MK's hosts.mk and ipaddresses.mk files, and reload Check_MK and Nagios.
     38 
     39{{{ 
    940extra_nagios_conf += r""" 
    1041define command { 
     
    1344} 
    1445""" 
    15  
    16  
    17 extra_service_conf["event_handler"] = [ 
    18         ( "cmk_reinventory", ALL_HOSTS, ["Check_MK inventory"]), 
    19 ] 
    20 extra_service_conf["event_handler_enabled"] = [ 
    21         ( "1", ALL_HOSTS, ["Check_MK inventory"]), 
    22 ] 
    23  
    24  
    2546extra_host_conf["event_handler"] = [ 
    2647        ( "del_vm", [ "vm" ], ALL_HOSTS ), 
     
    3051] 
    3152}}} 
     53 
     54=== 'del_vm' eventhandler script === 
     55{{{ 
     56#!/bin/bash 
     57# 
     58# Event handler script for removing a VM from Check_MK when its
     59# host check goes DOWN and euca-describe-instances no longer
     60# lists it as running or pending.
     61 
     62export PATH="/omd/sites/nagios/lib/perl5/bin:/omd/sites/nagios/local/bin:/omd/sites/nagios/bin:/omd/sites/nagios/local/lib/perl5/bin:/usr/kerberos/bin:/usr/local/bin:/bin:/usr/bin:/opt/local/bin:/opt/local/sbin" 
     63 
     64# Log file, then the arguments passed in by Nagios: host name and host state
     65LOG=/tmp/del_vm.log
     66HOSTNAME=$1
     67HOSTSTATE=$2
     68 
     69case "$HOSTSTATE" in 
     70 
     71UP) 
     72        # Do nothing when the host is UP
     73        ;; 
     74 
     75DOWN) 
     76 
     77        # Verify that the instance is gone, using euca-describe-instances
     78        RESULT=$(euca-describe-instances | egrep '(running|pending)' | grep -- "${HOSTNAME}" >/dev/null; echo $?)
     79        if [ "$RESULT" = 1 ]; then
     80 
     81        #Logging... 
     82        touch $LOG 
     83        echo $0 > $LOG 
     84        echo `date` >> $LOG 
     85        echo "HOSTNAME is $HOSTNAME" >> $LOG 
     86        echo "HOSTSTATE is $HOSTSTATE" >> $LOG 
     87        echo " " >> $LOG 
     88 
     89        # Clean up cmk 
     90        echo "Running cmk --flush $HOSTNAME" >> $LOG 
     91        ${OMD_ROOT}/bin/cmk --flush $HOSTNAME >> $LOG 
     92 
     93        # Remove the VM from Check_MK 
     94        echo "Removing $HOSTNAME from hosts.mk" >> $LOG 
     95        /bin/sed -i "/${HOSTNAME}/ d" "${OMD_ROOT}/etc/check_mk/conf.d/hosts.mk" >> $LOG
     96        echo "Removing $HOSTNAME from ipaddresses.mk" >> $LOG
     97        /bin/sed -i "/${HOSTNAME}/ d" "${OMD_ROOT}/etc/check_mk/conf.d/ipaddresses.mk" >> $LOG
     98 
     99        # Now re-inventory && reload 
     100        echo "Running cmk -IIu" >> $LOG 
     101        ${OMD_ROOT}/bin/cmk -IIu >> $LOG 
     102        echo "Running cmk -O" >> $LOG 
     103        ${OMD_ROOT}/bin/cmk -O >> $LOG 
     104 
     105        fi 
     106        ;; 
     107 
     108esac 
     109 
     110exit 0 
     111}}}
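
One caveat with the script above: its `grep` on the host name matches substrings, so a host named `vm-1` would still look active as long as `vm-10` is running, and the `sed` deletions would likewise remove lines for `vm-10` when `vm-1` is deleted. A minimal sketch of a stricter check follows, assuming the host name appears as a whole word in the `euca-describe-instances` output. The function name `is_instance_active` is hypothetical.

```shell
#!/bin/bash
# is_instance_active HOST
# Reads euca-describe-instances output on stdin and succeeds (exit 0) only if
# a line in state running or pending contains HOST as a whole word.
is_instance_active() {
    local host="$1"
    # -F: treat HOST literally (dots are not wildcards); -w: whole-word match.
    grep -E '(running|pending)' | grep -qwF -- "$host"
}
```

It could be called as `euca-describe-instances | is_instance_active "$HOSTNAME"`, with the removal steps run only when it fails. Note that `-w` treats `-` and `.` as word boundaries, so `vm-1` would still match a line containing `vm-1-test`. The sketch also does not tell a failed `euca-describe-instances` call apart from an empty result, which the caller would need to check separately.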