Archive for the ‘Monitoring’ Category

How to Update / Migrate zabbix-agent 5 to zabbix-agent2 6 on Redhat / CentOS / Fedora Linux

Friday, August 9th, 2024

Upgrade-zabbix-agent1-5-to-zabbix-agent2-6-on-RHEL-CentOS-Fedora-Linux-howto-logo

If you have servers reporting monitoring with Zabbix running still on Zabbix-Agent 1 version 5.0.X but already migrated the Zabbix-server to Zabbix 6, it is a good idea to update the Agent to Zabbix Agent 6 As sson as possible, as you know lacking behind in version makes updating harder and more complicated task.

Mine and I guess most system administrators experience points that Keeping at the same level of versioning on many applications historically has shown to reduce unexpected errors and bugs but nowadays, the rule of keeping local and remote application ( programs )  at the same version level is regularly broken.

Theoretically Zabbix-Agent (Client) and Zabbix (Server) has a compitability for a certain range of versions (Zabbix agents 2 from version 4.4 onwards are compatible with Zabbix 7.0; Zabbix agent 2 must not be newer than 7.0 – for more on zabbix agent – > server version compitability check here ) and having a slight version difference should not be really a problem but often you might have a third party proxies in between such as haproxy or zabbix-proxy or other network oddities and thus my personal opinion is that for interoperability it is better to keep the Zabbix Clients and Zabbix Servers across the DMZ-ed networks running at same version level.

Some would say I have an old fashion thinking as software and technology is moving forward, but as I see how programming code writing and even software is constantly degradating just a reflection of degradation of human element, I prefer to keep my old know how and always stick to same versioning whenever possible.

Some would wonder then why would I upgrade to Zabbix-agent2 ? , if have to keep the same versioning, the reason is zabbix-agent2 is written in GO Language and is much faster and supposably better piece of software than Zabbix Agent1 that is written in Python.

Moreover having Zabbix agent 2 instead of 1 gives also benefits as you can do a bit more with zabbix and on the other hand the machines are more ready for monitoring in terms of future. To know more about the Benefits of Zabbix Agent2 compared to Zabbix Agent 1 read the Agent vs Agent2 comparison on zabbix website.

 

With this little introduction, lets proceed with the exact steps to take to upgrade zabbix-agent1 to zabbix-agent2.

1. Check the current installed Zabbix-Agent version 

[user@monitored-server ~]$ rpm -qa |grep -i zabb
zabbix-get-5.0.42-1.el8.x86_64
zabbix-sender-5.0.42-1.el8.x86_64
zabbix-agent-5.0.42-1.el8.x86_64

[user@server ~]$ 

 

2. Create backup copy of current system working zabbix_agentd.conf
 

Before messing up with the working zabbix-agent as usual create the necessery backup to prevent later suprises

[user@monitored-server ~]$ cp -vrpf /etc/zabbix/zabbix_agentd.conf /etc/zabbix/zabbix_agentd.conf.bak-$(date '+%Y-%m-%d_%H-%M-%S')

3. Check current configured Zabbix repos

 

[user@monitored-server ~]$ vim /etc/yum.repos.d/zabbix.repo
 

[zabbix-4.0]
name = zabbix-4.0 – 8
baseurl = http://zabbix-repo-server.com/external/zabbix-4.0/8/$basearch
enabled = 0
gpgkey = http://zabbix-repo-server.com/external/zabbix-4.0/zabbix-official-repo.key
gpgcheck = 1

[zabbix-4.4]
name = zabbix-4.4 – 8
baseurl = http://zabbix-repo-server.com/external/zabbix-4.4/8/$basearch
enabled = 0
gpgkey = http://zabbix-repo-server.com/external/zabbix-4.4/zabbix-official-repo.key
gpgcheck = 1

[zabbix-5.0]
name = zabbix-5.0 – 8
baseurl = http://zabbix-repo-server.com/external/zabbix-5.0/8/$basearch
enabled = 1
gpgkey = http://zabbix-repo-server.com/external/zabbix-5.0/zabbix-official-repo.key
gpgcheck = 1

[zabbix-5.4]
name = zabbix-5.4 – 8
baseurl = http://zabbix-repo-server.com/external/zabbix-5.4/8/$basearch
enabled = 0
gpgkey = http://zabbix-repo-server.com/external/zabbix-5.4/zabbix-official-repo.key
gpgcheck = 1

[zabbix-6.0]
name = zabbix-6.0 – 8
baseurl = http://zabbix-repo-server.com/external/zabbix-6.0/8/$basearch
enabled = 0
gpgkey = http://zabbix-repo-server.com/external/zabbix-6.0/zabbix-official-repo.key
gpgcheck = 1


4. Modify repositories and include the Zabbix Agent6 yum repos 
 

[user@monitored-server ~]$ cp -rpf zabbix.repo zabbix.repo.5.0.rpmsave

As we want to keep only the 6.0 version, leave only the zabbix-6.0 section and enable the repo:
 

[user@monitored-server ~]$ vim /etc/yum.repos.d/zabbix.repo

[zabbix-6.0]
name = zabbix-6.0 – 8
baseurl = http://zabbix-repo-server.com/external/zabbix-6.0/8/$basearch
enabled = 1
gpgkey = http://zabbix-repo-server.com/external/zabbix-6.0/zabbix-official-repo.key
gpgcheck = 1


5. Update zabbix-agent to zabbix-agent2 and update zabbix-get zabbix-sender versions

To not disrupt reported monitoring for zabbix-agent, don't delete zabbix-agent1 but instead in pararallel install and configure
zabbix-agent2 and then once configuration is migrated from Agent 1 to 2, stop the old zabbix-agent and bring up the new one.

[user@monitored-server ~]$ yum check-update

[user@monitored-server ~]$ yum install zabbix-agent2 zabbix-get zabbix-sender -y

Note that if you want to have a precise version number of zabbix-agent that is lets say 6.0.31 to correspond to zabbix-server 6.0.31 (even though in the repositories newer RPM versions are available), run:
 

[user@monitored-server ~]$ yum upgrade zabbix-agent2-6.0.31-release1.el8

 

  • Check new zabbix_agent2 installed version 


# zabbix_agent2 -V
zabbix_agent2 (Zabbix) 6.0.31
Revision b6d93755a1b 17 June 2024, compilation time: {undefined} {undefined}, built with: go1.21.3
Plugin communication protocol version is 6.0.13

Copyright (C) 2024 Zabbix SIA
License GPLv2+: GNU GPL version 2 or later <https://www.gnu.org/licenses/>.
This is free software: you are free to change and redistribute it according to
the license. There is NO WARRANTY, to the extent permitted by law.

This product includes software developed by the OpenSSL Project
for use in the OpenSSL Toolkit (http://www.openssl.org/).

Compiled with OpenSSL 1.1.1k  FIPS 25 Mar 2021
Running with OpenSSL 1.1.1k  FIPS 25 Mar 2021

We use the library Eclipse Paho (eclipse/paho.mqtt.golang), which is
distributed under the terms of the Eclipse Distribution License 1.0 (The 3-Clause BSD License)
available at https://www.eclipse.org/org/documents/edl-v10.php

We use the library go-modbus (goburrow/modbus), which is
distributed under the terms of the 3-Clause BSD License
available at https://github.com/goburrow/modbus/blob/master/LICENSE

 

6. Migrate old /etc/zabbix/zabbix_agentd.conf to /etc/zabbix/zabbix-agent2.conf

For readability to show the main configured variables for zabbix-agent without the tons of comments, to later include in agent2
 

[root@monitored-server ~]# cat /etc/zabbix/zabbix_agentd.conf | grep -v '\#' | sed '/^$/d' 
PidFile=/var/run/zabbix/zabbix_agentd.pid
LogFile=/var/log/zabbix/zabbix_agentd.log
LogFileSize=0
Server=10.50.37.8,127.0.0.1
ServerActive=10.50.37.8,127.0.0.1
Hostname=fqdn-of-monitored-host.domain.com
Timeout=20
Include=/etc/zabbix/zabbix_agentd.d/*.conf

The default zabbix-agent2 installed config would like similar to:

[root@monitored-server ~]# cat /etc/zabbix/zabbix_agent2.conf | grep -v '\#' | sed '/^$/d'
PidFile=/run/zabbix/zabbix_agent2.pid
LogFile=/var/log/zabbix/zabbix_agent2.log
LogFileSize=0
Server=127.0.0.1
# Specify the location of the Zabbix server host.
ServerActive=127.0.0.1
Hostname=Zabbix server
Include=/etc/zabbix/zabbix_agent2.d/*.conf
PluginSocket=/run/zabbix/agent.plugin.sock
ControlSocket=/run/zabbix/agent.sock
Include=./zabbix_agent2.d/plugins.d/*.conf

The new migrate one, should be like:

[root@monitored-server ~]# vim /etc/zabbix/zabbix_agent2.conf
PidFile=/run/zabbix/zabbix_agent2.pid
LogFile=/var/log/zabbix/zabbix_agent2.log
LogFileSize=10
Server=10.34.89.7,127.0.0.1
ServerActive=10.34.89.7,127.0.0.1
Hostname=lqgblu02f.ffm.de.int.atosorigin.com
Timeout=20
Include=/etc/zabbix/zabbix_agent2.d/*.conf
PluginSocket=/run/zabbix/agent.plugin.sock
ControlSocket=/run/zabbix/agent.sock
Include=/etc/zabbix/zabbix_agent2.d/plugins.d/*.conf


7. Add few Optimization variables for better zabbix-server -> zabbix-proxy -> zabbix-server interactions 

If you have sometimes a network delays between zabbix server -> zabbix client and vice versa (depending on whether Zabbix agent is configured as Active or Passive mode), it is often useful 
to add those 2 variables:

# How often list of active checks is refreshed, in seconds
RefreshActiveChecks=60
# Refresh the active checks on start.ForceActiveChecksOnStart=1
ForceActiveChecksOnStart=1


Also it might be a good practice to add zabbix_agent2.log monitoring with the agent itself, if the log exceeds certain amount, instead of calling it via logrotate.
 

# Perform log file rotation at the 1 MB point for the specified filepath
LogFileSize=1

 

[root@monitored-server ~]# vim /etc/zabbix/zabbix_agent2.conf
PidFile=/run/zabbix/zabbix_agent2.pid
LogFile=/var/log/zabbix/zabbix_agent2.log
LogFileSize=10
Server=10.34.89.7,127.0.0.1
ServerActive=10.34.89.7,127.0.0.1
Hostname=lqgblu02f.ffm.de.int.atosorigin.com
RefreshActiveChecks=60
ForceActiveChecksOnStart=1
Timeout=20
Include=/etc/zabbix/zabbix_agent2.d/*.conf
PluginSocket=/run/zabbix/agent.plugin.sock
ControlSocket=/run/zabbix/agent.sock
Include=/etc/zabbix/zabbix_agent2.d/plugins.d/*.conf

 

8. Stop the old zabbix agent process and run the new one

# systemctl status –full zabbix-agent2
# systemctl stop zabbix-agent


Assuming that the configuratoin of zabbix-agent is correct, execute zabbix-agent2 via system control.and check its status
 

# systemctl start zabbix-agent2
# systemctl status –full zabbix-agent2


If no errors in the configuration, the zabbix_agent2 process should be up and running and the status of above systemctl cmd should report fine.
If you need concretics regarding exact Zabbix checks or whther current conigured Userparameter scripts errors, or any other warnings or errors
of zabbix_agent2 interacting to the server, check further the logs

[root@monitored-server ~]# tail -n 10 /var/log/zabbix/zabbix_agent2.log  
2024/08/06 17:26:52.998749 using plugin 'WebPage' (built-in) providing following interfaces: exporter, configurator
2024/08/06 17:26:52.998760 using plugin 'ZabbixAsync' (built-in) providing following interfaces: exporter
2024/08/06 17:26:52.998794 using plugin 'ZabbixStats' (built-in) providing following interfaces: exporter, configurator
2024/08/06 17:26:52.998804 lowering the plugin ZabbixSync capacity to 1 as the configured capacity 100 exceeds limits
2024/08/06 17:26:52.998820 using plugin 'ZabbixSync' (built-in) providing following interfaces: exporter
2024/08/06 17:26:52.998993 Plugin communication protocol version is 6.0.13
2024/08/06 17:26:52.999018 Zabbix Agent2 hostname: [lqgblu02f.ffm.de.int.atosorigin.com]
2024/08/06 17:26:54.000667 [102] cannot connect to [127.0.0.1:10051]: dial tcp :0->127.0.0.1:10051: connect: connection refused
2024/08/06 17:26:54.000836 [102] active check configuration update from host [lqgblu02f.ffm.de.int.atosorigin.com] started to fail
2024/08/06 17:26:59.344837 Zabbix Agent 2 stopped. (6.0.31)

DNS Monitoring: Check and Alert if DNS nameserver resolver of Linux machine is not properly resolving shell script. Monitor if /etc/resolv.conf DNS runs Okay

Thursday, March 14th, 2024

linux-monitor-check-dns-is-resolving-fine

If you happen to have issues occasionally with DNS resolvers and you want to keep up an eye on it and alert if DNS is not properly resolving Domains, because sometimes you seem to have issues due to network disconnects, disturbances (modifications), whatever and you want to have another mean to see whether a DNS was reachable or unreachable for a time, here is a little bash shell script that does the "trick".

Script work mechacnism is pretty straight forward as you can see we check what are the configured nameservers if they properly resolve and if they're properly resolving we write to log everything is okay, otherwise we write to the log DNS is not properly resolvable and send an ALERT email to preconfigured Email address.

Below is the check_dns_resolver.sh script:

 

#!/bin/bash
# Simple script to Monitor DNS set resolvers hosts for availability and trigger alarm  via preset email if any of the nameservers on the host cannot resolve
# Use a configured RESOLVE_HOST to try to resolve it via available configured nameservers in /etc/resolv.conf
# if machines are not reachable send notification email to a preconfigured email
# script returns OK 1 if working correctly or 0 if there is issue with resolving $RESOLVE_HOST on $SELF_HOSTNAME and mail on $ALERT_EMAIL
# output of script is to be kept inside DNS_status.log

ALERT_EMAIL='your.email.address@email-fqdn.com';
log=/var/log/dns_status.log;
TIMEOUT=3; DNS=($(grep -R nameserver /etc/resolv.conf | cut -d ' ' -f2));  

SELF_HOSTNAME=$(hostname –fqdn);
RESOLVE_HOST=$(hostname –fqdn);

for i in ${DNS[@]}; do dns_status=$(timeout $TIMEOUT nslookup $RESOLVE_HOST  $i); 

if [[ “$?” == ‘0’ ]]; then echo "$(date "+%y.%m.%d %T") $RESOLVE_HOST $i on host $SELF_HOST OK 1" | tee -a $log; 
else 
echo "$(date "+%y.%m.%d %T")$RESOLVE_HOST $i on host $SELF_HOST NOT_OK 0" | tee -a $log; 

echo "$(date "+%y.%m.%d %T") $RESOLVE_HOST $i DNS on host $SELF_HOST resolve ERROR" | mail -s "$RESOLVE_HOST /etc/resolv.conf $i DNS on host $SELF_HOST resolve ERROR";

fi

 done

Download check_dns_resolver.sh here set the script to run via a cron job every lets say 5 minutes, for example you can set a cronjob like this:
 

# crontab -u root -e
*/5 * * * *  check_dns_resolver.sh 2>&1 >/dev/null

 

Then Voila, check the log /var/log/dns_status.log if you happen to run inside a service downtime and check its output with the rest of infrastructure componets, network switch equipment, other connected services etc, that should keep you in-line to proof during eventual RCA (Root Cause Analysis) if complete high availability system gets down to proof your managed Linux servers was not the reason for the occuring service unavailability.

A simplified variant of the check_dns_resolver.sh can be easily integrated to do Monitoring with Zabbix userparameter script and DNS Check Template containing few Triggers, Items and Action if I have time some time in the future perhaps, I'll blog a short article on how to configure such DNS zabbix monitoring, the script zabbix variant of the DNS monitor script is like this:

[root@linux-server bin]# cat check_dns_resolver.sh 
#!/bin/bash
TIMEOUT=3; DNS=($(grep -R nameserver /etc/resolv.conf | cut -d ' ' -f2));  for i in ${DNS[@]}; do dns_status=$(timeout $TIMEOUT nslookup $(hostname –fqdn) $i); if [[ “$?” == ‘0’ ]]; then echo "$i OK 1"; else echo "$i NOT OK 0"; fi; done

[root@linux-server bin]#


Hope this article, will help someone to improve his Unix server Infrastucture monitoring.

Enjoy and Cheers !

How to count number of ESTABLISHED state TCP connections to a Windows server

Wednesday, March 13th, 2024

count-netstat-established-connections-on-windows-server-howto-windows-logo-debug-network-issues-windows

Even if you have the background of a Linux system administrator, sooner or later you will have have to deal with some Windows hosts, thus i'll blog in this article shortly on how the established TCP if it happens you will have to administarte a Windows hosts or help a windows sysadmin noobie 🙂

In Linux it is pretty easy to check the number of established conenctions, because of the wonderful command wc (word count). with a simple command like:
 

$ netstat -etna |wc -l


Then you will get the number of active TCP connections to the machine and based on that you can get an idea on how busy the server is.

But what if you have to deal with lets say a Microsoft Windows 2012 /2019 / 2020 or 2022 Server, assuming you logged in as Administrator and you see the machine is quite loaded and runs multiple Native Windows Administrator common services such as IIS / Active directory Failover Clustering, Proxy server etc.
How can you identify the established number of connections via a simple command in cmd.exe?

1.Count ESTABLISHED TCP connections from Windows Command Line

Here is the answer, simply use netstat native windows command and combine it with find, like that and use the /i (ignores the case of characters when searching the string) /c (count lines containing the string) options

C:\Windows\system32>netstat -p TCP -n|  find /i "ESTABLISHED" /c
1268

Voila, here are number of established connections, only 1268 that is relatively low.
However if you manage Windows servers, and you get some kind of hang ups as part of the monitoring, it is a good idea to setup a script based on this simple command for at least Windows Task Scheduler (the equivallent of Linux's crond service) to log for Peaks in Established connections to see whether Server crashes are not related to High Rise in established connections.
Even better if company uses Zabbix / Nagios, OpenNMS or other  old legacy monitoring stuff like Joschyd even as of today 2024 used in some big of the TOP IT companies such as SAP (they were still using it about 4 years ago for their SAP HANA Cloud), you can set the script to run and do a Monitoring template or Alerting rules to draw you graphs and Trigger Alerts if your connections hits a peak, then you at least might know your Windows server is under a "Hackers" Denial of Service attack or there is something happening on the network, like Cisco Network Infrastructure Switch flappings or whatever.

Perhaps an example script you can use if you decide to implement the little nestat established connection checks Monitoring in Zabbix is the one i've writen about in the previous article "Calculate established connection from IP address with shell script and log to zabbix graphic".

2. Few Useful netstat options for the Windows system admin
 

C:\Windows\System32> netstat -bona


netstat-useful-arguments-for-the-windows-system-administrator

Cmd.exe will lists executable files, local and external IP addresses and ports, and the state in list form. You immediately see which programs have created connections or are listening so that you can find offenders quickly.

b – displays the executable involved in  creating the connection.
o – displays the owning process ID.
n – displays address and port numbers.
a – displays all connections and listening ports.

As you can see in the screenshot, by using netstat -bona you get which process has binded to which local address and the Process ID PID of it, that is pretty useful in debugging stuff.

3. Use a Third Party GUI tool to debug more interactively connection issues

If you need to keep an eye in interactive mode, sometimes if there are issues CurrPorts tool can be of a great help

currports-windows-network-connections-diagnosis-cports

CurrPorts Tool own Description

CurrPorts is network monitoring software that displays the list of all currently opened TCP/IP and UDP ports on your local computer. For each port in the list, information about the process that opened the port is also displayed, including the process name, full path of the process, version information of the process (product name, file description, and so on), the time that the process was created, and the user that created it.
In addition, CurrPorts allows you to close unwanted TCP connections, kill the process that opened the ports, and save the TCP/UDP ports information to HTML file , XML file, or to tab-delimited text file.
CurrPorts also automatically mark with pink color suspicious TCP/UDP ports owned by unidentified applications (Applications without version information and icons).

Sum it up

What we learned is how to calculate number of established TCP connections from command line, useful for scripting, how you can use netstat to display the process ID and Process name that relates to a used Local / Remote TCP connections, and how eventually you can use this to connect it to some monitoring tool to periodically report High Peaks with TCP established connections (usually an indicator of servere system issues).
 

Zabbix script to track arp address cache loss (arp incomplete) from Linux server to gateway IP

Tuesday, January 30th, 2024

Zabbix_arp-network-incomplete-check-logo.svg

Some of the Linux servers recently, I'm responsible had a very annoying issue recently. The problem is ARP address to default configured server gateway is being lost, every now and then and it takes up time, fot the remote CISCO router to realize the problem and resolve it. We have debugged with the Network expert colleague, while he was checking the Cisco router and we were checking the arp table on the Linux server with arp command. And we came to conclusion this behavior is due to some network mess because of too many NAT address configurations on the network or due to a Cisco bug. The colleagues asked Cisco but cisco does not have any solution to the issue and the only close work around for the gateway loosing the mac is to set a network rule on the Cisco router to flush its arp record for the server it was loosing the MAC address for.
This does not really solve completely the problem but at least, once we run into the issue, it gets resolved as quick as 5 minutes time. }

As we run a cluster environment it is useful to Monitor and know immediately once we hit into the MAC gateway disappear issue and if the issue persists, exclude the Linux node from the Cluster so we don't loose connection traffic.
For the purpose of Monitoring MAC state from the Linux haproxy machine towards the Network router GW, I have developed a small userparameter script, that is periodically checking the state of the MAC address of the IP address of remote gateway host and log to a external file for any problems with incomplete MAC address of the Remote configured default router.

In case if you happen to need the same MAC address state monitoring for your servers, I though that might be of a help to anyone out there.
To monitor MAC address incomplete state with Zabbix, do the following:
 

1. Create  userparamater_arp_gw_check.conf Zabbix script
 

# cat userparameter_arp_gw_check.conf 
UserParameter=arp.check,/usr/local/bin/check_gw_arp.sh

 

2. Create the following shell script /usr/local/bin/check_gw_arp.sh

 

#!/bin/bash
# simple script to run on cron peridically or via zabbix userparameter
# to track arp loss issues to gateway IP
#gw_ip='192.168.0.55';
gw_ip=$(ip route show|grep -i default|awk '{ print $3 }');
log_f='/var/log/arp_incomplete.log';
grep_word='incomplete';
inactive_status=$(arp -n "$gw_ip" |grep -i $grep_word);
# if GW incomplete record empty all is ok
if [[ $inactive_status == ” ]]; then 
echo $gw_ip OK 1; 
else 
# log inactive MAC to gw_ip
echo "$(date '+%Y-%m-%d %H:%M:%S')" "ARP_ERROR $inactive_status 0" | tee -a $log_f 2>&1 >/dev/null;
# printout to zabbix
echo "1 ARP FAILED: $inactive_status"; 
fi

You can download the check_gw_arp.sh here.

The script is supposed to automatically grep for the Default Gateway router IP, however before setting it up. Run it and make sure this corresponds correctly to the default Gateway IP MAC you would like to monitor.
 

3. Create New Zabbix Template for ARP incomplete monitoring
 

arp-machine-to-default-gateway-failure-monitoring-template-screenshot

Create Application 

*Name
Default Gateway ARP state

4. Create Item and Dependent Item 
 

Create Zabbix Item and Dependent Item like this

arp-machine-to-default-gateway-failure-monitoring-item-screenshot

 

arp-machine-to-default-gateway-failure-monitoring-item1-screenshot

arp-machine-to-default-gateway-failure-monitoring-item2-screenshot


5. Create Trigger to trigger WARNING or whatever you like
 

arp-machine-to-default-gateway-failure-monitoring-trigger-screenshot


arp-machine-to-default-gateway-failure-monitoring-trigger1-screenshot

arp-machine-to-default-gateway-failure-monitoring-trigger2-screenshot


6. Create Zabbix Action to notify via Email etc.
 

arp-machine-to-default-gateway-failure-monitoring-action1-screenshot

 

arp-machine-to-default-gateway-failure-monitoring-action2-screenshot

That's all. Once you set up this few little things, you can enjoy having monitoring Alerts for your ARP state incomplete on your Linux / Unix servers.
Enjoy !

How to monitor Postfix Mail server work correct with simple one liner Zabbix user parameter script / Simple way to capture and report SMTP machine issues Zabbix template

Thursday, June 22nd, 2023

setup-zabbix-smtp-mail-monitoring-postfix-qmail-exim-with-easy-userparameter-script-and-template-zabbix-logo

In this article, I'm going to show you how to setup a very simple monitoring if a local running SMTP (Postfix / Qmail / Exim) is responding correctly on basic commands. The check would helpfully keep you in track to know whether your configured Linux server local MTA (Mail Transport Agent) is responding on requests on TCP / IP protocol Port 25, as well as a check for process existence of master (that is the main postfix) proccess, as well as the usual postfix spawned sub-processes qmgr (the postfix queue manager), tsl mgr (TLS session cache and PRNG manager), pickup (Postfix local mail pickup) – or email receiving process.

 

Normally a properly configured postfix installation on a Linux whatever you like distribution would look something like below:

#  ps -ef|grep -Ei 'master|postfix'|grep -v grep
root        1959       1  0 Jun21 ?        00:00:00 /usr/libexec/postfix/master -w
postfix     1961    1959  0 Jun21 ?        00:00:00 qmgr -l -t unix -u
postfix     4542    1959  0 Jun21 ?        00:00:00 tlsmgr -l -t unix -u
postfix  2910288    1959  0 11:28 ?        00:00:00 pickup -l -t unix -u

At times, during mail server restarts the amount of processes that are sub spawned by postfix, may very and if you a do a postfix restart

# systemctl restart postfix

The amout of spawned processes running as postfix username might decrease, and only qmgr might be available for second thus in the consequential shown Template the zabbix processes check to make sure the Postfix is properly operational on the Linux machine is made to check for the absolute minumum of 

1. master (postfix process) that runs with uid root
2. and one (postfix) username binded proccess 

If the amount of processes on the host is less than this minimum number and the netcat is unable to simulate a "half-mail" sent, the configured Postfix alarm Action (media and Email) will take place, and you will get immediately notified, that the monitored Mail server has issue!

The idea is to use a small one liner connection with netcat and half simulate a normal SMTP transaction just like you would normally do:

 

root@pcfrxen:/root # telnet localhost 25
Trying 127.0.0.1…
Connected to localhost.
Escape character is '^]'.
220 This is Mail2 Pc-Freak.NET ESMTP
HELO localhost
250 This is Mail2 Pc-Freak.NET
MAIL FROM:<hipopo@pc-freak.net>
250 ok
RCPT TO:<hip0d@remote-smtp-server.com>

 

and then disconnect the connection.

1. Create new zabbix userparameter_smtp_check.conf file

The simple userparameter one liner script to do the task looks like this:

# vi /etc/zabbix/zabbix_agent.d/userparameter_smtp_check.conf

UserParameter=smtp.check,(if [[ $(echo -e “HELO localhost\n MAIL FROM: root@$HOSTNAME\n RCPT TO: report-email@your-desired-mail-server.com\n  QUIT\n” | /usr/bin/nc localhost 25 -w 5 2>&1 | grep -Ei ‘220\s.*\sESMTP\sPostfix|250\s\.*|250\s\.*\sOk|250\s\.*\sOk|221\.*\s\w’|wc -l) == ‘5’ ]]; then echo "SMTP OK 1"; else echo "SMTP NOK 0"; fi)

Set the proper permissions so either file is owned by zabbix:zabbix or it is been able to be read from all system users.
 

# chmod a+r /etc/zabbix/zabbix_agent.d/userparameter_smtp_check.conf

2. Create a new Template for the Mail server monitoring
 


 

Just like any other template name it with what fits you as you see, I've call it PROD SMTP Monitoring, as the template is prepared to specifically monitor In Production Linux machines, and a separate template is used to monitor the Quality Assurance (QAs) as well as PreProd (Pre Productions).

3. Create the followng Items and Depedent Item to process zabbix-agent received data from the Userparam script
 

Above is the list of basic Items and Dependent Item you will need to configure inside the SMTP Check zabbix Template.

The Items should have the following content and configurations:
 

/postfix-main-proc-service-item-zabbix-shot


*Name: postfix_main_proc.service
Type: Zabbix agent(active)
*Key: proc.num[master,root]
Type of Information: Numeric (unassigned)
*Update interval: 30s
Custom Intervals: Flexible
*History storage period: 90d
*Trend storage period: 365d
Show Value: as is
Applications: Postfix Checks
Populated host inventory field: -None-
Description: The item counts master daemon process that runs Postfix daemons on demand

Where the arguments pased to proc.num[] function are:
  master is the process that is being looked up for and root is the username with which the the postfix master daemon is running. If you need to adapt it for qmail or exim that shouldn't be a big deal you only have to in advance check the exact processes that are normally running on the machine
and configure a similar process check for it.

*Name: postfix_sub_procs.service_cnt
Type: Zabbix agent(active)
*Key: proc.num[,postfix]
Type of information: Numeric (unassigned)
Update Interval: 30s
*History Storage period: Storage Period 90d
*Trend storage period: Storage Period 365d
Description: The item counts master daemon processes that runs postfix daemons on demand.

Here the idea with this Item is to check the number of processes that are running with user / groupid that is postfix. Again for other SMPT different from postfix, just set it to whatever user / group 
you would like zabbix to look up for in Linux the process list. As you can see here the check for existing postfix mta process is done every 30 seconds (for more critical environments you can put it to less).

For simple zabbix use this Dependent Item is not necessery required. But as we would like to process more closely the output of the userparameter smtp script, you have to set it up.
If you want to write graphical representation by sending data to Grafana.

*Name: postfix availability check
Key: postfix_boolean_check[boolean]
Master Item: PROD SMTP Monitoring: postfix availability check
Type of Information: Numeric unassigned
*History storage period: Storage period 90d
*Trend storage period: 365d

Applications: Postfix Checks

Description: It returns boolean value of SMTP check
1 – True (SMTP is OK)
0 – False (SMTP does not responds)

Enabled: Tick

*Name: postfix availability check
*Key: smtp.check
Custom intervals: Flexible
*Update interval: 30 m
History sotrage period: Storage Period 90d
Applications: Postfix Checks
Populates host inventory field: -None-
Description: This check is testing if the SMTP relay is reachable, without actual sending an email
Enabled: Tick

4. Configure following Zabbix Triggers

 

Note: The severity levels you should have previosly set in Zabbix up to your desired ones.

Name: postfix master root process is not running
*Problem Expression: {PROD SMTP Monitoring:proc.num[master,root].last()}<1

OK event generation: Recovery expression
*Recovery Expression: {PROD SMTP Monitoring:proc.num[master,root].last()}>=1
Allow manual close: Tick

Description: The item counts master daemon process that runs Postfix daemon on demand.
Enabed: Tick

I would like to have an AUTO RESOLVE for any detected mail issues, if an issue gets resolved. That is useful especially if you don't have the time to put the Zabbix monitoring in Maintainance Mode during Operating system planned updates / system reboots or unexpected system reboots due to electricity power loss to the server colocated – Data Center / Rack . 


*Name: postfix master sub processes are not running
*Problem Expression: {P09 PROD SMTP Monitoring:proc.num[,postfix].last()}<1
PROBLEM event generation mode: Single
OK event closes: All problems

*Recovery Expression: {P09 PROD SMTP Monitoring:proc.num[,postfix].last()}>=1
Problem event generation mode: Single
OK event closes: All problems
Allow manual close: Tick
Enabled: Tick

Name: SMTP connectivity check
Severity: WARNING
*Expression: {PROD SMTP Monitoring:postfix_boolen_check[boolean].last()}=0
OK event generation: Expression
PROBLEM even generation mode: SIngle
OK event closes: All problems

Allow manual close: Tick
Enabled: Tick

5. Configure respective Zabbix Action

 

zabbix-configure-Actions-screenshotpng
 

As the service is tagged with 'pci service' tag we define the respective conditions and according to your preferences, add as many conditions as you need for the Zabbix Action to take place.

NOTE! :
Assuming that communication chain beween Zabbix Server -> Zabbix Proxy (if zabbix proxy is used) -> Zabbix Agent works correctly you should start receiving that from the userparameter script in Zabbix with the configured smtp.check userparam key every 30 minutes.

Note that this simple nc check will keep a trail records inside your /var/log/maillog for each netcat connection, so keep in mind that in /var/log/maillog on each host which has configured the SMTP Check zabbix template, you will have some records  similar to:

# tail -n 50 /var/log/maillog
2023-06-22T09:32:18.164128+02:00 lpgblu01f postfix/smtpd[2690485]: improper command pipelining after HELO from localhost[127.0.0.1]:  MAIL FROM: root@your-machine-fqdn-address.com\n RCPT TO: your-supposable-receive-addr@whatever-mail-address.com\n  QUIT\n
2023-06-22T09:32:18.208888+02:00 lpgblu01f postfix/smtpd[2690485]: 32EB02005B: client=localhost[127.0.0.1]
2023-06-22T09:32:18.209142+02:00 lpgblu01f postfix/smtpd[2690485]: disconnect from localhost[127.0.0.1] helo=1 mail=1 rcpt=1 quit=1 commands=4
2023-06-22T10:02:18.889440+02:00 lpgblu01f postfix/smtpd[2747269]: connect from localhost[127.0.0.1]
2023-06-22T10:02:18.889553+02:00 lpgblu01f postfix/smtpd[2747269]: improper command pipelining after HELO from localhost[127.0.0.1]:  MAIL FROM: root@your-machine-fqdn-address.com\n RCPT TO: your-supposable-receive-addr@whatever-mail-address.com\n  QUIT\n
2023-06-22T10:02:18.933933+02:00 lpgblu01f postfix/smtpd[2747269]: E3ED42005B: client=localhost[127.0.0.1]
2023-06-22T10:02:18.934227+02:00 lpgblu01f postfix/smtpd[2747269]: disconnect from localhost[127.0.0.1] helo=1 mail=1 rcpt=1 quit=1 commands=4
2023-06-22T10:32:26.143282+02:00 lpgblu01f postfix/smtpd[2804195]: connect from localhost[127.0.0.1]
2023-06-22T10:32:26.143439+02:00 lpgblu01f postfix/smtpd[2804195]: improper command pipelining after HELO from localhost[127.0.0.1]:  MAIL FROM: root@your-machine-fqdn-address.com\n RCPT TO: your-supposable-receive-addr@whatever-mail-address.com\n  QUIT\n
2023-06-22T10:32:26.186681+02:00 lpgblu01f postfix/smtpd[2804195]: 2D7F72005B: client=localhost[127.0.0.1]
2023-06-22T10:32:26.186958+02:00 lpgblu01f postfix/smtpd[2804195]: disconnect from localhost[127.0.0.1] helo=1 mail=1 rcpt=1 quit=1 commands=4
2023-06-22T11:02:26.924039+02:00 lpgblu01f postfix/smtpd[2860398]: connect from localhost[127.0.0.1]
2023-06-22T11:02:26.924160+02:00 lpgblu01f postfix/smtpd[2860398]: improper command pipelining after HELO from localhost[127.0.0.1]:  MAIL FROM: root@your-machine-fqdn-address.com\n RCPT TO: your-supposable-receive-addr@whatever-mail-address.com\n  QUIT\n
2023-06-22T11:02:26.963014+02:00 lpgblu01f postfix/smtpd[2860398]: EB08C2005B: client=localhost[127.0.0.1]
2023-06-22T11:02:26.963257+02:00 lpgblu01f postfix/smtpd[2860398]: disconnect from localhost[127.0.0.1] helo=1 mail=1 rcpt=1 quit=1 commands=4
2023-06-22T11:32:29.145553+02:00 lpgblu01f postfix/smtpd[2916905]: connect from localhost[127.0.0.1]
2023-06-22T11:32:29.145664+02:00 lpgblu01f postfix/smtpd[2916905]: improper command pipelining after HELO from localhost[127.0.0.1]:  MAIL FROM: root@your-machine-fqdn-address.com\n RCPT TO: your-supposable-receive-addr@whatever-mail-address.com\n  QUIT\n
2023-06-22T11:32:29.184539+02:00 lpgblu01f postfix/smtpd[2916905]: 2CF7D2005B: client=localhost[127.0.0.1]
2023-06-22T11:32:29.184729+02:00 lpgblu01f postfix/smtpd[2916905]: disconnect from localhost[127.0.0.1] helo=1 mail=1 rcpt=1 quit=1 commands=4

 

 

That's all folks use the :
Configuration -> Host (menu)

and assign the new SMTP check template to as many of the Linux hosts where you have setup the Userparameter script and Enjoy the new mail server monitoring at hand.

Install specific zabbix-agent version / Downgrade Zabbix Agent client to exact preferred old RPM version on CentOS / Fedora / RHEL Linux from repo

Wednesday, June 7th, 2023

zabbix-update-downgrade-on-centos-rhel-fedora-and-other-rpm-based-linux-zabbix-logo

 

In below article, I'll give you the short Update zabbix procedure to specific version release, if you need to have it running in tandem with rest of zabbix infra, as well as expain shortly how to downgrade zabbix version to a specific release number
to match your central zabbix-serveror central zabbix proxies.

The article is based on personal experience how to install / downgrade the specific zabbuix-agent  release on RPM based distros.
I know this is pretty trivial stuff but still, hope this might be useful to some sysadmin out there thus I decided to quickly blog it.

 

1. Prepare backup of zabbix_agentd.conf
 

cp -rpf /etc/zabbix/zabbix_agentd.conf /home/your-user/zabbix_agentd.conf.bak.$(date +"%b-%d-%Y")

 

2. Create zabbix repo source file in yum.repos.d directory

cd /etc/yum.repos.d/
vim zabbix.repo 

 

[zabbix-5.0]

name=Zabbix 5.0 repo

baseurl=http://zabixx-rpm-mirrors-site.com/centos/external/zabbix-5.0/8/x86_64/

enabled=1

gpgcheck=0

 

3. Update zabbix-agent to a specific defined version

yum search zabbix-agent –enablerepo zabbix-5.0

To update zabbix-agent for RHEL 7.*

# yum install zabbix-agent-5.0.34-1.el7.x86_64


For RHEL 8.*

# yum install zabbix-agent-5.0.34-1.el8.x86_64


4. Restart zabbix-agentd and check its status to make sure it works correctly
 

systemctl status zabbix-agentd
systemctl restart zabbix-agentd
# systemctl status zabbix-agentd


Go to zabbix-server WEB GUI interface and check that data is delivered as normally in Latest Data for the host fom recent time, to make sure host monitoring is continuing flawlessly as before change.

NB !: If yum use something like versionlock is enabled remove the versionlock for package and update then, otherwise it will (weirldly look) look like the package is missing.
I'm saying that because I've hit this issue and was wondering why i cannot install the zabbix-agent even though the version is listed, available and downloadable from the repository.


5. Downgrade agent-client to specific version (Install old version of Zabbix from Repo)
 

Sometimes by mistake you might have raised the Zabbix-agent version to be higher release than the zabbix-server's version and thus breach out the Zabbix documentation official recommendation to keep
up the zabbix-proxy, zabbix-server and zabbix-agent at the exactly same version major and minor version releases. 

If so, then you would want to decrease / downgrade the version, to match your Zabbix overall infrastructure exact version for each of Zabbix server -> Zabbix Proxy server -> Agent clients.

To downgrade the version, I prefer to create some backups, just in case for all /etc/zabbix/ configurations and userparameter scripts (from experience this is useful as sometimes some RPM binary update packages might cause /etc/zabbix/zabbix_agentd.conf file to get overwritten. To prevent from restoring zabbix_agentd.conf from your most recent backup hence, I prefer to just crease the zabbix config backups manually.
 

# cd /root

# mkdir -p /root/backup/zabbix-agent 

# tar -czvf zabbix_agent.tar.gz /etc/zabbix/

# tar -xzvf zabbix_agent.tar.gz 


Then list the available installable zabbix-agent versions
 

[root@sysadminshelp:~]# yum –showduplicates list zabbix-agent
Заредени плъгини: fastestmirror
Determining fastest mirrors
 * base: centos.uni-sofia.bg
 * epel: fedora.ipacct.com
 * extras: centos.uni-sofia.bg
 * remi: mirrors.uni-ruse.bg
 * remi-php74: mirrors.uni-ruse.bg
 * remi-safe: mirrors.uni-ruse.bg
 * updates: centos.uni-sofia.bg
Инсталирани пакети
zabbix-agent.x86_64                                                     5.0.30-1.el7                                                     @zabbix
Налични пакети
zabbix-agent.x86_64                                                     5.0.0-1.el7                                                      zabbix
zabbix-agent.x86_64                                                     5.0.1-1.el7                                                      zabbix
zabbix-agent.x86_64                                                     5.0.2-1.el7                                                      zabbix
zabbix-agent.x86_64                                                     5.0.3-1.el7                                                      zabbix
zabbix-agent.x86_64                                                     5.0.4-1.el7                                                      zabbix
zabbix-agent.x86_64                                                     5.0.5-1.el7                                                      zabbix
zabbix-agent.x86_64                                                     5.0.6-1.el7                                                      zabbix
zabbix-agent.x86_64                                                     5.0.7-1.el7                                                      zabbix
zabbix-agent.x86_64                                                     5.0.8-1.el7                                                      zabbix
zabbix-agent.x86_64                                                     5.0.9-1.el7                                                      zabbix
zabbix-agent.x86_64                                                     5.0.10-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.11-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.12-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.13-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.14-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.15-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.16-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.17-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.18-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.19-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.20-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.21-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.22-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.23-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.24-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.25-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.26-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.27-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.28-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.29-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.30-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.31-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.32-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.33-1.el7                                                     zabbix
zabbix-agent.x86_64                                                     5.0.34-1.el7                                                     zabbix

 

Next lets install the most recent zabbix-versoin from the CentOS repo, which for me as of time of writting this article is 5.0.34.

# yum downgrade -y zabbix-agent-5.0.34-1.el7

# cp -rpf /root/backup/zabbix-agent/etc/zabbix/zabbix_agentd.conf /etc/zabbix/

# systemctl start zabbix-agent.service

# systemctl enable  zabbix-agent.service
 

# zabbix_agentd -V
zabbix_agentd (daemon) (Zabbix) 5.0.30
Revision 2c96c38fb4b 28 November 2022, compilation time: Nov 28 2022 11:27:43

Copyright (C) 2022 Zabbix SIA
License GPLv2+: GNU GPL version 2 or later <https://www.gnu.org/licenses/>.
This is free software: you are free to change and redistribute it according to
the license. There is NO WARRANTY, to the extent permitted by law.

This product includes software developed by the OpenSSL Project
for use in the OpenSSL Toolkit (http://www.openssl.org/).

Compiled with OpenSSL 1.0.1e-fips 11 Feb 2013
Running with OpenSSL 1.0.1e-fips 11 Feb 2013

 

That's all folks you should be at your custom selected preferred version of zabbix-agent.
Enjoy ! 🙂

Install btop on Debian Linux, btop an advanced htop like monitoring for Linux to beautify your console life

Tuesday, May 30th, 2023

btop-linux-monitoring-tool-screenshot-help-menu

I've accidently stubmled on btop a colorful and interactive ncurses like command line utility to provide you a bunch of information about CPU / memory / disks and processes with nice console graphic in the style of Cubic Player 🙂
Those who love htop and like their consoles to be full of shiny colors, will really appreciate those nice Linux monitoring tool.
To install btop on latest current stable Debian bullseyes, you will have to install it via backports, as the regular Debian repositories does not have the tool available out of the box.

To Add backports packages support for your Debian 11:

1. Edit /etc/apt/sources.list and include following repositories

 

# vim /etc/apt/sources.list

deb http://deb.debian.org/debian bullseye-backports main contrib non-free
deb-src http://deb.debian.org/debian bullseye-backports main contrib non-free


2. Update the known repos list to include it

 

# apt update


3. Install the btop deb package from backports

 

# apt-cache show btop|grep -A 20 -i descrip
Description-en: Modern and colorful command line resource monitor that shows usage and stats
 btop is a modern and colorful command line resource monitor that shows
 usage and stats for processor, memory, disks, network and processes.
 btop features:
  – Easy to use, with a game inspired menu system.
  – Full mouse support, all buttons with a highlighted key is clickable
  and mouse scroll works in process list and menu boxes.
  – Fast and responsive UI with UP, DOWN keys process selection.
  – Function for showing detailed stats for selected process.
  – Ability to filter processes.
  – Easy switching between sorting options.
  – Tree view of processes.
  – Send any signal to selected process.
  – UI menu for changing all config file options.
  – Auto scaling graph for network usage.
  – Shows IO activity and speeds for disks
  – Battery meter
  – Selectable symbols for the graphs
  – Custom presets
  – And more…
  btop is written in C++ and is continuation of bashtop and bpytop.
Description-md5: 73df6c70fe01f5bf05cca0e3031c1fe2
Multi-Arch: foreign
Homepage: https://github.com/aristocratos/btop
Section: utils
Priority: optional
Filename: pool/main/b/btop/btop_1.2.7-1~bpo11+1_amd64.deb
Size: 431500
SHA256: d79e35c420a2ac5dd88ee96305e1ea7997166d365bd2f30e14ef57b556aecb36


 

# apt install -t bullsye-backports btop –yes

Once I installed it, I can straight use it except on some of my Linux machines, which were having a strange encoding $LANG defined, those ones spitted some errors like:

root@freak:~# btop
ERROR: No UTF-8 locale detected!
Use –utf-force argument to force start if you're sure your terminal can handle it.

 


To work around it simply redefine LANG variable and rerun it
 

# export LANG=en_US.UTF8

# btop

 

btop-linux-monitoring-console-beautiful-colorful-tool-graphics-screenshot

btop-linux-monitoring-tool-screenshot-help-menu

Monitor cluster heartbeat lines IP reahability via ping ICMP protocol with Zabbix

Wednesday, April 12th, 2023

https://pc-freak.net/images/zabbix-monitoring-icmp-ping-on-application-crm-clusters-with-userparameter-script-howto

Say you're having an haproxy load balancer cluster with two or more nodes and you are running the servers inside some complex organizational hybrid complex network that is a combination of a local DMZ lans, many switches, dedicated connectivity lines and every now and then it happens for the network to mysteriously go down. Usually simply setting monitoring on the network devices CISCO itself or the smart switches used is enough to give you an overview on what's going on but if haproxy is in the middle of the end application servers and in front of other Load balancers and network equipment sometimes it might happen that due to failure of a network equipment / routing issues or other strange unexpected reasons one of the 2 nodes connectivity might fail down via the configured dedicated additional Heartbeat lines that are usually configured in order to keep away the haproxy CRM Resource Manager cluster thus ending it up in a split brain scenarios.

Assuming that this is the case like it is with us you would definitely want to keep an eye on the connectivity of Connect Line1 and Connect Line2 inside some monitoring software like zabbix. As our company main monitoring software used to monitor our infrastructure is Zabbix in this little article, I'll briefly explain how to configre the network connectivity status change from haproxy node1 and haproxy node2 Load balancer cluster to be monitored via a simple ICMP ping echo checks.

Of course the easies way to configure an ICMP monitor via Zabbix is using EnableRemoteCommands=1 inside /etc/zabbix/zabbix-agentd.conf but if your infrastructure should be of High Security and PCI perhaps this options is prohibited to be used on the servers. This is why to achieve still the ICMP ping checks with EnableRemoteCommands=0 a separate simple bash user parameter script could be used. Read further to find out one way ICMP monitoring with a useparameter script can be achieved with Zabbix.


1. Create the userparameter check for heartbeat lines

root@haproxy1 zabbix_agentd.d]# cat userparameter_check_heartbeat_lines.conf
UserParameter=heartbeat.check,\
/etc/zabbix/scripts/check_heartbeat_lines.sh

root@haproxy2 zabbix_agentd.d]# cat userparameter_check_heartbeat_lines.conf
UserParameter=heartbeat.check,\
/etc/zabbix/scripts/check_heartbeat_lines.sh

2. Create check_heartbeat_lines.sh script which will be actually checking connectivity with simple ping

root@haproxy1 zabbix_agentd.d]# cat /etc/zabbix/scripts/check_heartbeat_lines.sh
#!/bin/bash
hb1=haproxy2-lb1
hb2=haproxy2-lb2
if ping -c 1 $hb1  &> /dev/null
then
  echo "$hb1 1"
else
  echo "$hb1 0"
fi
if ping -c 1 $hb2  &> /dev/null
then
  echo "$hb2 1"
else
  echo "$hb2 0"
fi

[root@haproxy1 zabbix_agentd.d]#

root@haproxy2 zabbix_agentd.d]# cat /etc/zabbix/scripts/check_heartbeat_lines.sh
#!/bin/bash
hb1=haproxy1-hb1
hb2=haproxy1-hb2
if ping -c 1 $hb1  &> /dev/null
then
  echo "$hb1 1"
else
  echo "$hb1 0"
fi
if ping -c 1 $hb2  &> /dev/null
then
  echo "$hb2 1"
else
  echo "$hb2 0"
fi

[root@haproxy2 zabbix_agentd.d]#


3. Test script heartbeat lines first time

Each of the nodes from the cluster are properly pingable via ICMP protocol

The script has to be run on both haproxy1 and haproxy2 Cluster (load) balancer nodes

[root@haproxy-hb1 zabbix_agentd.d]# /etc/zabbix/scripts/check_heartbeat_lines.sh
haproxy2-hb1 1
haproxy2-hb2 1

[root@haproxy-hb2 zabbix_agentd.d]# /etc/zabbix/scripts/check_heartbeat_lines.sh
haproxy1-hb1 1
haproxy1-hb2 1


The status of 1 returned by the script should be considered remote defined haproxy node is reachable / 0 means ping command does not return any ICMP status pings back.

4. Restart the zabbix-agent on both cluster node machines that will be conducting the ICMP ping check

[root@haproxy zabbix_agentd.d]# systemctl restart zabbix-agentd
[root@haproxy zabbix_agentd.d]# systemctl status zabbix-agentd

[root@haproxy zabbix_agentd.d]# tail -n 100 /var/log/zabbix_agentd.log


5. Create Item to process the userparam script

Create Item as follows:

6. Create the Dependent Item required
 

zabbix-heartbeat-check-screenshots/heartbeat-line1-preprocessing

For processing you need to put the following simple regular expression

Name: Regular Expression
Parameters: hb1(\s+)(\d+)
Custom on fail: \2

zabbix-heartbeat-check-screenshots/heartbeat-line2-preprocessing1

zabbix-heartbeat-check-screenshots/heartbeat-lines-triggers

 

7. Create triggers that will be generating the Alert

Create the required triggers as well

zabbix-heartbeat-check-screenshots/heartbeat2-line
Main thing to configure here in Zabbix is below expression

Expression: {FQDN:heartbeat2.last()}<1

triggers_heartbeat1

You can further configure Zabbix Alerts to mail yourself or send via Slack / MatterMost or Teams alarms in case of problems.

Enable PSK encryption on Zabbix Agent (client) sent encrypted monitored datas to Zabbix server

Friday, April 7th, 2023

zabbix-client-server-encryption-public-key-exchange

Those concerned of security and in use of their Zabbix monitored data who communicate Zabbix collected agent
data over internet or via some kind of untrusted network might definitely not enjoy the fact that zabbix-agent sents
its collected data to server in a plain text. Clear text data is allowing any network sniffer to possibly collect your
monitored server and hardware devices data and exposes all data sent over the network to same problems like in the past
the old uencrypted SMTP protocol.

To mitigate those great security hole for the paranoid sys admin it is rather easy to enable PSK (Pre Shared Key) based encryption.
To generate Pre Shared key you have to had to important values present

1. PSK Identity
2. PSK Secret

PSK secret should be minimum of 128 bit (16-byte PSK, entered as 32 hexadecimal digits), and supports up to
2048 bit (256-byte PSK, entered as 512 hexadecimal digits)

Usually something like 256 bit PSK secret on the machine should be strong enough and simply generated by running

# openssl rand -hex 32

1. Agent to zabbix server or proxy connection config

In /etc/zabbix/zabbix_agentd.conf for a Server Active (e.g. server to actively request the client to sent its collected data)
On machine running zabbix-agent should have a configuration similar to:

# cat /etc/zabbix/zabbix_agentd.conf

PidFile=/var/run/zabbix/zabbix_agentd.pid
LogFile=/var/log/zabbix/zabbix_agentd.log
LogFileSize=0

# IP of the machine
SourceIP=10.10.10.30
# turn it on if you need to execute to remote machine commands
EnableRemoteCommands=0

# IP of the server
servers=10.30.50.80
ListenPort=10080

# IP of the machine
ListenIP=10.30.30.31

# IP of the server
ServerActive=10.30.50.80

HostMetadataItem=system.uname
BufferSize=5400
MaxLinesPerSecond=5
Timeout=10
AllowRoot=0
StartAgents=5
LogRemoteCommands=0


# Machine hostname
Hostname=fqdn-of-zabbix-data-collect-server.com
Include=/etc/zabbix/zabbix_agentd.d/*.conf

# Encryption
TLSConnect=psk
TLSAccept=psk
TLSPSKIdentity=PSK to Zabbix Server5
TLSPSKFile=/etc/zabbix/zabbix_agentd.psk


! Important security note

!!! The TLSPSKIdentity value you decide will not be encrypted on transport, so don't use anything sensitive.

Once you include the TSL config

2 Generate / Create Zabbix Agent Key

Generate the key with pseudo-random bites inside /etc/zabbix/zabbix_agentd_key.psk

# cd /etc/zabbix
# openssl rand -hex 32 > zabbix_agentd_key.psk
# chown zabbix:zabbix zabbix_agentd_key.psk
# chmod 600 zabbix_agentd_key.psk

3. Configure PSK encryption in Zabbix Server Web User interface

Go to Zabbix Server User interface in browser and configure the PSK encryption options for the host.

Select the:

'Connections to host' = PSK

'Connections from host' = PSK

'PSK Identity' = [public-value-configured-in-Zabbix-agent-config]

'PSK' = [paste the long hex string generated from the OpenSSL command above]


In some seconds up to a minute or two the Zabbix Server and Agent will successfully communicate using PSK encryption.
Making the monitored data unreadable in plain text for malignant sniffers hanging in the middle equipment between the zabbix-agent and zabbix-server hosts.

4. PSK encryption behind a Proxy

Many companies, nowadays use zabbix proxy for improvement of network infrastrucutre. For example it is used to offload the zabbix-server when multiple zabbix-agents have to report various datas or to monitor servers and devices that are phyisically in separate networks or data centers (are passing through paranoic built firewalls) or monitor locations are having unreliable communications between each other.
 

To enable PSK for communications between your Zabbix Server and Zabbix Proxy.

1. Create a new secret, and add the PSK Identity and Secret to

Administration ⇾ Proxies ⇾ [Your proxy] ⇾ Encryption

2. Adjust the settings inside the zabbix proxies configuration file at /etc/zabbix/zabbix_proxy.conf


If setting up PSK encryption for agents behind a Zabbix proxy, ensure your have

Zabbix Server ⇽⇾ Proxy PSK enabled
first in Zabbix Server UI.

This is because, when you start the Proxy, or do some testing to send some key value to Zabbix server via the proxy with commands :

# zabbix_get -s 127.0.0.1 -k system.hostname
# zabbix_server -R config_cache_reload


config_cache_reload, the Proxy will download all its host settings from the server, and this also includes the servers copy of the secret.

The proxy needs to know the secret since it is now managing the communications on behalf of the server.

3. To add PSK encryption for any Agents behind a proxy, then you continue to set up the Agents as normal by creating a new secret, editing

Configuration ⇾ Hosts ⇾ [Your Host] ⇾ Encryption page

and also editing /etc/zabbix/zabbix_agentd.conf.

Remember that, since your Agents Host configuration in the Zabbix UI will be set as Monitored by Proxy, the PSK settings will be applicable for communications happening between the Zabbix Proxy and the Agent that it is monitoring, not between the Zabbix Server and the Agent behind the proxy.

You can also add PSK Encryption between your Zabbix Proxy and its own local Agent if you want.
You would set its PSK settings in the Proxy Agents host configuration at

Configuration ⇾ Hosts ⇾ [Your proxy] ⇾ Encryption

and modify the settings in the agents on configuration file at /etc/zabbix/zabbix_agentd.conf.
Keep in mind, this is only applicable to communications between the Zabbix Proxy, and its own Agent process.

When setting up PSK encryption for the Zabbix Server, Proxy and Agents, you may see an error in the Proxy logs,

cannot send proxy data to server at "zabbix.your-domain.tld": connection of type "TLS with PSK" is not allowed for proxy "your-proxy".

If you hit this, check that your

Zabbix Server ⇽⇾ Proxy PSK settings

are correct first.

Don't get confused between the Proxies own optional agent process, and its main Proxy process which is required.

Zabbix: Monitor Linux rsyslog configured central log server is rechable with check_log_server_status.sh userparameter script

Wednesday, June 8th, 2022

zabbix-monitor-central-log-server-is-reachable-from-host-with-a-userparamater-script-zabbix-logo

On modern Linux OS servers on Redhat / CentOS / Fedora and Debian based distros log server service is usually running on the system  such as rsyslog (rsyslogd) to make sure the logging from services is properly logged in separate logs under /var/log.

A very common practice on critical server machines in terms of data security, where logs produced by rsyslog daermon needs to be copied over network via TCP or UDP protocol immediately is to copy over the /var/log produced logs to another configured central logging server. Then later every piece of bit generated by rsyslogd could be  overseen by a third party auditor person and useful for any investigation in case of logs integrity is required or at worse case if there is a suspicion that system in question is hacked by a malicious hax0r and logs have been "cleaned" up from any traces leading to the intruder (things usually done locally by hackers) or by any automated script exploit tools since yesr.

This doubled logging of system events to external log server  ipmentioned is very common practice by companies to protect their log data and quite useful for logs to be recovered easily later on from the central logging server machine that could be also setup for example to use rsyslogd to receive logs from other Linux machines in circumstances where some log disappears just like that (things i've seen happen) for any strange reason or gets destroyed by the admins mistake locally on machine / or by any other mean such as filesystem gets damaged. a very common practice by companies to protect their log data.  

Monitor remote logging server is reachable with userparameter script

Assuming that you already have setup a logging from the server hostname A towards the Central logging server log storepool and everything works as expected the next logical step is to have at least some basic way to monitor remote logging server configured is still reachable all the time and respectively rsyslog /var/log/*.* logs gets properly produced on remote side for example with something like a simple TCP remote server port check and reported in case of troubles in zabbix.

To solve that simple task for company where I'm employed, I've developed below check_log_server_status.sh:
 

#!/bin/bash
# @@ for TCP @ for UDP
# check_log_server_status.sh Script to check if configured TCP / UDP logging server in /etc/rsyslog.conf is rechable
# report to zabbix
DELIMITER='@@';
GREP_PORT='5145';
CONNECT_TIMEOUT=5;

PORT=$(grep -Ei "*.* $DELIMITER.*:$GREP_PORT" /etc/rsyslog.conf|awk -F : '{ print $2 }'|sort -rn |uniq);

#for i in $(grep -Ei "*.* $DELIMITER.*:$GREP_PORT" /etc/rsyslog.conf |grep -v '\#'|awk -F"$DELIMITER" '{ print $2 }' | awk -F ':' '{ print $1 }'|sort -rn); do
HOST=$(grep -Ei "*.* $DELIMITER.*:$GREP_PORT" /etc/rsyslog.conf |grep -v '\#'|awk -F"$DELIMITER" '{ print $2 }' | awk -F ':' '{ print $1 }'|sort -rn)

# echo $PORT

if [[ ! -z $PORT ]] && [[ ! -z $HOST ]]; then
SSH_RETURN=$(/bin/ssh $HOST -p $PORT -o ConnectTimeout=$CONNECT_TIMEOUT 2>&1);
else
echo "PROBLEM Port $GREP_PORT not defined in /etc/rsyslog.conf";
fi

##echo SSH_RETURN $SSH_RETURN;
#exit 1;
if [[ $(echo $SSH_RETURN |grep -i ‘Connection timed out during banner exchange’ | wc -l) -eq ‘1’ ]]; then
echo "rsyslogd $HOST:$PORT OK";
fi

if [[ $(echo $SSH_RETURN |grep -i ‘Connection refused’ | wc -l) -eq ‘1’ ]]; then
echo "rsyslogd $HOST:$PORT PROBLEM";
fi

#sleep 2;
#done


You can download a copy of the script check_log_server_status.sh here

Depending on the port the remote rsyslogd central logging server is using configure it in the script with respective port through the DELIMITER='@@', GREP_PORT='5145', CONNECT_TIMEOUT=5 values.

The delimiter is setup as usually in /etc/rsyslog.conf this the remote logging server for TCP IP is configured with @@ prefix to indicated TCP mode should be used.

Below is example from /etc/rsyslog.conf of how the rsyslogd server is configured:

[root@Server-hostA /root]# grep -i @@ /etc/rsyslogd.conf
# central remote Log server IP / port
*.* @@10.10.10.1:5145

To use the script on a machine, where you have a properly configured zabbix-agentd service host connected and reporting data to a zabbix-server monitoring server.

1. Set up the script under /usr/local/bin/check_log_server_status.sh

[root@Server-hostA /root ]# vim /usr/local/bin/check_log_server_status.sh

[root@Server-hostA /root ]# chmod +x /usr/local/bin/check_log_server_status.sh

2. Prepare userparameter_check_log_server.conf with log_server.check Item key

[root@Server-hostA zabbix_agentd.d]# cat userparameter_check_log_server.conf 
UserParameter=log_server.check, /usr/local/bin/check_log_server_status.sh

3. Set in Zabbix some Item such as on below screenshot

 

check-log-server-status-screenshot-linux-item-zabbix.png4. Create a Zabbix trigger 

check-log-server-status-trigger-logserver-is-unreachable-zabbix


The redded hided field in Expression field should be substituted with your actual hostname on which the monitor script will run.