Mon Aug 29 21:53:18 EEST 2011

Monitoring Windows hosts with Nagios on Debian GNU/Linux

1. Install necessery nagios debian packages

apt-get install nagios-images nagios-nrpe-plugin nagios-nrpe-server nagios-plugins nagios-plugins-basic nagios-plugins-standard \
nagios3 nagios3-cgi nagios3-common nagios3-core


2. Edit /etc/nagios-plugins/config/nt.cfg

In the File substitute:

define command { command_name check_nt command_line /usr/lib/nagios/plugins/check_nt -H '$HOSTADDRESS$' -v '$ARG1$' }


With:

define command {
command_name check_nt
command_line /usr/lib/nagios/plugins/check_nt -H '$HOSTADDRESS$' -p 12489 -v $ARG1$ $ARG2$
}


3. Modify nrpe.cfg to put in allowd hoss to connect to the Nagions nrpe server

vim /etc/nagios/nrpe.cfg

Lookup inside for nagios's configuration directive:

allowed_hosts=127.0.0.1


In order to allow more hosts to report to the nagios nrpe daemon, change the value to let's say:

allowed_hosts=127.0.0.1,192.168.1.4,192.168.1.5,192.168.1.6


This config allows the three IPs 192.168.1.4-6 to be able to report for nrpe.

For the changes to nrpe server to take effect, it has to be restrarted.

debian:~# /etc/init.d/nagios-nrpe-server restart


Further on some configurations needs to be properly done on the nrpe agent Windows hosts in this case 192.168.1.4,192.168.1.5,192.168.1.6

4. Install the nsclient++ on all Windows hosts which CPU, Disk, Temperature and services has to be monitored

Download the agent from http://sourceforge.net/projects/nscplus and launch the installer, click twice on it and follow the installation screens. Its necessery that during installation the agent has the NRPE protocol enabled. After the installation is complete one needs to modify the NSC.ini
By default many of nsclient++ tracking modules are not enabled in NSC.ini, thus its necessery that the following DLLs get activated in the conf:

FileLogger.dll
CheckSystem.dll
CheckDisk.dll
NSClientListener.dll
SysTray.dll
CheckEventLog.dll
CheckHelpers.dll


Another requirement is to instruct the nsclient++ angent to have access to the Linux installed nagios server again with adding it to the allowed_hosts config variable:

allowed_hosts=192.168.1.1


In my case the Nagios runs on Debian Lenny (Squeeze) 6 and possess the IP address of 192.168.1.1
To test the intalled windows nsclient++ agents are properly installed a simple telnet connection from the Linux host is enough:

5. Create necessery configuration for the nagios Linux server to include all the Windows hosts which will be monitored

There is a window.cfg template file located in /usr/share/doc/nagios3-common/examples/template-object/windows.cfg on Debian.

The file is a good start point for creating a conf file to be understand by nagios and used to periodically refresh information about the status of the Windows hosts.

Thus it's a good idea to copy the file to nagios3 config directory:

debian:~# mkdir /etc/nagios3/objects
debian:~# cp -rpf /usr/share/doc/nagios3-common/examples/template-object/windows.cfg /etc/nagios3/objects/windows.cfg


A sample windows.cfg content, (which works for me fine) and monitor a couple of Windows nodes running MS-SQL service and IIS and makes sure the services are up and running are:

define host{
use windows-server ; Inherit default values from a template
host_name Windows1 ; The name we're giving to this host
alias Iready Server ; A longer name associated with the host
address 192.168.1.4 ; IP address of the host
}
define host{
use windows-server ; Inherit default values from a template
host_name Windows2 ; The name we're giving to this host
alias Iready Server ; A longer name associated with the host
address 192.168.1.4 ; IP address of the host
}
define hostgroup{
hostgroup_name windows-servers ; The name of the hostgroup
alias Windows Servers ; Long name of the group
}
define hostgroup{
hostgroup_name IIS
alias IIS Servers
members Windows1,Windows2
}
define hostgroup{
hostgroup_name MSSQL
alias MSSQL Servers
members Windows1,Windows2
}
define service{
use generic-service
host_name Windows1
service_description NSClient++ Version
check_command check_nt!CLIENTVERSION
}
define service{ use generic-service
host_name Windows1
service_description Uptime
check_command check_nt!UPTIME
}
define service{ use generic-service
host_name Windows1
service_description CPU Load
check_command check_nt!CPULOAD!-l 5,80,90
}
define service{
use generic-service
host_name Windows1
service_description Memory Usage
check_command check_nt!MEMUSE!-w 80 -c 90
define service{
use generic-service
host_name Windows1
service_description C:\ Drive Space
check_command check_nt!USEDDISKSPACE!-l c -w 80 -c 90
}
define service{
use generic-service
host_name Windows1
service_description W3SVC
check_command check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
}
define service{
use generic-service
host_name Windows1
service_description Explorer
check_command check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
}
define service{
use generic-service
host_name Windows2
service_description NSClient++ Version
check_command check_nt!CLIENTVERSION
}
define service{ use generic-service
host_name Windows2
service_description Uptime
check_command check_nt!UPTIME
}
define service{ use generic-service
host_name Windows2
service_description CPU Load
check_command check_nt!CPULOAD!-l 5,80,90
}
define service{
use generic-service
host_name Windows2
service_description Memory Usage
check_command check_nt!MEMUSE!-w 80 -c 90
define service{
use generic-service
host_name Windows2
service_description C:\ Drive Space
check_command check_nt!USEDDISKSPACE!-l c -w 80 -c 90
}
define service{
use generic-service
host_name Windows2
service_description W3SVC
check_command check_nt!SERVICESTATE!-d SHOWALL -l W3SVC
}
define service{
use generic-service
host_name Windows2
service_description Explorer
check_command check_nt!PROCSTATE!-d SHOWALL -l Explorer.exe
}
define service{ use generic-service
host_name Windows1
service_description SQL port Check
check_command check_tcp!1433
}
define service{
use generic-service
host_name Windows2
service_description SQL port Check
check_command check_tcp!1433
}
The above config, can easily be extended for more hosts, or if necessery easily setup to track more services in nagios web frontend.
6. Test if connectivity to the nsclient++ agent port is available from the Linux server