Posts Tagged ‘connectivity’

Monitor cluster heartbeat lines IP reahability via ping ICMP protocol with Zabbix

Wednesday, April 12th, 2023

https://pc-freak.net/images/zabbix-monitoring-icmp-ping-on-application-crm-clusters-with-userparameter-script-howto

Say you're having an haproxy load balancer cluster with two or more nodes and you are running the servers inside some complex organizational hybrid complex network that is a combination of a local DMZ lans, many switches, dedicated connectivity lines and every now and then it happens for the network to mysteriously go down. Usually simply setting monitoring on the network devices CISCO itself or the smart switches used is enough to give you an overview on what's going on but if haproxy is in the middle of the end application servers and in front of other Load balancers and network equipment sometimes it might happen that due to failure of a network equipment / routing issues or other strange unexpected reasons one of the 2 nodes connectivity might fail down via the configured dedicated additional Heartbeat lines that are usually configured in order to keep away the haproxy CRM Resource Manager cluster thus ending it up in a split brain scenarios.

Assuming that this is the case like it is with us you would definitely want to keep an eye on the connectivity of Connect Line1 and Connect Line2 inside some monitoring software like zabbix. As our company main monitoring software used to monitor our infrastructure is Zabbix in this little article, I'll briefly explain how to configre the network connectivity status change from haproxy node1 and haproxy node2 Load balancer cluster to be monitored via a simple ICMP ping echo checks.

Of course the easies way to configure an ICMP monitor via Zabbix is using EnableRemoteCommands=1 inside /etc/zabbix/zabbix-agentd.conf but if your infrastructure should be of High Security and PCI perhaps this options is prohibited to be used on the servers. This is why to achieve still the ICMP ping checks with EnableRemoteCommands=0 a separate simple bash user parameter script could be used. Read further to find out one way ICMP monitoring with a useparameter script can be achieved with Zabbix.


1. Create the userparameter check for heartbeat lines

root@haproxy1 zabbix_agentd.d]# cat userparameter_check_heartbeat_lines.conf
UserParameter=heartbeat.check,\
/etc/zabbix/scripts/check_heartbeat_lines.sh

root@haproxy2 zabbix_agentd.d]# cat userparameter_check_heartbeat_lines.conf
UserParameter=heartbeat.check,\
/etc/zabbix/scripts/check_heartbeat_lines.sh

2. Create check_heartbeat_lines.sh script which will be actually checking connectivity with simple ping

root@haproxy1 zabbix_agentd.d]# cat /etc/zabbix/scripts/check_heartbeat_lines.sh
#!/bin/bash
hb1=haproxy2-lb1
hb2=haproxy2-lb2
if ping -c 1 $hb1  &> /dev/null
then
  echo "$hb1 1"
else
  echo "$hb1 0"
fi
if ping -c 1 $hb2  &> /dev/null
then
  echo "$hb2 1"
else
  echo "$hb2 0"
fi

[root@haproxy1 zabbix_agentd.d]#

root@haproxy2 zabbix_agentd.d]# cat /etc/zabbix/scripts/check_heartbeat_lines.sh
#!/bin/bash
hb1=haproxy1-hb1
hb2=haproxy1-hb2
if ping -c 1 $hb1  &> /dev/null
then
  echo "$hb1 1"
else
  echo "$hb1 0"
fi
if ping -c 1 $hb2  &> /dev/null
then
  echo "$hb2 1"
else
  echo "$hb2 0"
fi

[root@haproxy2 zabbix_agentd.d]#


3. Test script heartbeat lines first time

Each of the nodes from the cluster are properly pingable via ICMP protocol

The script has to be run on both haproxy1 and haproxy2 Cluster (load) balancer nodes

[root@haproxy-hb1 zabbix_agentd.d]# /etc/zabbix/scripts/check_heartbeat_lines.sh
haproxy2-hb1 1
haproxy2-hb2 1

[root@haproxy-hb2 zabbix_agentd.d]# /etc/zabbix/scripts/check_heartbeat_lines.sh
haproxy1-hb1 1
haproxy1-hb2 1


The status of 1 returned by the script should be considered remote defined haproxy node is reachable / 0 means ping command does not return any ICMP status pings back.

4. Restart the zabbix-agent on both cluster node machines that will be conducting the ICMP ping check

[root@haproxy zabbix_agentd.d]# systemctl restart zabbix-agentd
[root@haproxy zabbix_agentd.d]# systemctl status zabbix-agentd

[root@haproxy zabbix_agentd.d]# tail -n 100 /var/log/zabbix_agentd.log


5. Create Item to process the userparam script

Create Item as follows:

6. Create the Dependent Item required
 

zabbix-heartbeat-check-screenshots/heartbeat-line1-preprocessing

For processing you need to put the following simple regular expression

Name: Regular Expression
Parameters: hb1(\s+)(\d+)
Custom on fail: \2

zabbix-heartbeat-check-screenshots/heartbeat-line2-preprocessing1

zabbix-heartbeat-check-screenshots/heartbeat-lines-triggers

 

7. Create triggers that will be generating the Alert

Create the required triggers as well

zabbix-heartbeat-check-screenshots/heartbeat2-line
Main thing to configure here in Zabbix is below expression

Expression: {FQDN:heartbeat2.last()}<1

triggers_heartbeat1

You can further configure Zabbix Alerts to mail yourself or send via Slack / MatterMost or Teams alarms in case of problems.

Hack: Using ssh / curl or wget to test TCP port connection state to remote SSH, DNS, SMTP, MySQL or any other listening service in PCI environment servers

Wednesday, December 30th, 2020

using-curl-ssh-wget-to-test-tcp-port-opened-or-closed-for-web-mysql-smtp-or-any-other-linstener-in-pci-linux-logo

If you work on PCI high security environment servers in isolated local networks where each package installed on the Linux / Unix system is of importance it is pretty common that some basic stuff are not there in most cases it is considered a security hole to even have a simple telnet installed on the system. I do have experience with such environments myself and thus it is pretty daunting stuff so in best case you can use something like a simple ssh client if you're lucky and the CentOS / Redhat / Suse Linux whatever distro has openssh-client package installed.
If you're lucky to have the ssh onboard you can use telnet in same manner as netcat or the swiss army knife (nmap) network mapper tool to test whether remote service TCP / port is opened or not. As often this is useful, if you don't have access to the CISCO / Juniper or other (networ) / firewall equipment which is setting the boundaries and security port restrictions between networks and servers.

Below is example on how to use ssh client to test port connectivity to lets say the Internet, i.e.  Google / Yahoo search engines.
 

[root@pciserver: /home ]# ssh -oConnectTimeout=3 -v google.com -p 23
OpenSSH_7.9p1 Debian-10+deb10u2, OpenSSL 1.1.1g  21 Apr 2020
debug1: Connecting to google.com [172.217.169.206] port 23.
debug1: connect to address 172.217.169.206 port 23: Connection timed out
debug1: Connecting to google.com [2a00:1450:4017:80b::200e] port 23.
debug1: connect to address 2a00:1450:4017:80b::200e port 23: Cannot assign requested address
ssh: connect to host google.com port 23: Cannot assign requested address
root@pcfreak:/var/www/images# ssh -oConnectTimeout=3 -v google.com -p 80
OpenSSH_7.9p1 Debian-10+deb10u2, OpenSSL 1.1.1g  21 Apr 2020
debug1: Connecting to google.com [172.217.169.206] port 80.
debug1: connect to address 172.217.169.206 port 80: Connection timed out
debug1: Connecting to google.com [2a00:1450:4017:807::200e] port 80.
debug1: connect to address 2a00:1450:4017:807::200e port 80: Cannot assign requested address
ssh: connect to host google.com port 80: Cannot assign requested address
root@pcfreak:/var/www/images# ssh google.com -p 80
ssh_exchange_identification: Connection closed by remote host
root@pcfreak:/var/www/images# ssh google.com -p 80 -v -oConnectTimeout=3
OpenSSH_7.9p1 Debian-10+deb10u2, OpenSSL 1.1.1g  21 Apr 2020
debug1: Connecting to google.com [172.217.169.206] port 80.
debug1: connect to address 172.217.169.206 port 80: Connection timed out
debug1: Connecting to google.com [2a00:1450:4017:80b::200e] port 80.
debug1: connect to address 2a00:1450:4017:80b::200e port 80: Cannot assign requested address
ssh: connect to host google.com port 80: Cannot assign requested address
root@pcfreak:/var/www/images# ssh google.com -p 80 -v -oConnectTimeout=5
OpenSSH_7.9p1 Debian-10+deb10u2, OpenSSL 1.1.1g  21 Apr 2020
debug1: Connecting to google.com [142.250.184.142] port 80.
debug1: connect to address 142.250.184.142 port 80: Connection timed out
debug1: Connecting to google.com [2a00:1450:4017:80c::200e] port 80.
debug1: connect to address 2a00:1450:4017:80c::200e port 80: Cannot assign requested address
ssh: connect to host google.com port 80: Cannot assign requested address
root@pcfreak:/var/www/images# ssh google.com -p 80 -v
OpenSSH_7.9p1 Debian-10+deb10u2, OpenSSL 1.1.1g  21 Apr 2020
debug1: Connecting to google.com [172.217.169.206] port 80.
debug1: Connection established.
debug1: identity file /root/.ssh/id_rsa type 0
debug1: identity file /root/.ssh/id_rsa-cert type -1
debug1: identity file /root/.ssh/id_dsa type -1
debug1: identity file /root/.ssh/id_dsa-cert type -1
debug1: identity file /root/.ssh/id_ecdsa type -1
debug1: identity file /root/.ssh/id_ecdsa-cert type -1
debug1: identity file /root/.ssh/id_ed25519 type -1
debug1: identity file /root/.ssh/id_ed25519-cert type -1
debug1: identity file /root/.ssh/id_xmss type -1
debug1: identity file /root/.ssh/id_xmss-cert type -1
debug1: Local version string SSH-2.0-OpenSSH_7.9p1 Debian-10+deb10u2
debug1: ssh_exchange_identification: HTTP/1.0 400 Bad Request

 


debug1: ssh_exchange_identification: Content-Type: text/html; charset=UTF-8


debug1: ssh_exchange_identification: Referrer-Policy: no-referrer


debug1: ssh_exchange_identification: Content-Length: 1555


debug1: ssh_exchange_identification: Date: Wed, 30 Dec 2020 14:13:25 GMT


debug1: ssh_exchange_identification:


debug1: ssh_exchange_identification: <!DOCTYPE html>

debug1: ssh_exchange_identification: <html lang=en>

debug1: ssh_exchange_identification:   <meta charset=utf-8>

debug1: ssh_exchange_identification:   <meta name=viewport content="initial-scale=1, minimum-scale=1, width=device-width">

debug1: ssh_exchange_identification:   <title>Error 400 (Bad Request)!!1</title>

debug1: ssh_exchange_identification:   <style>

debug1: ssh_exchange_identification:     *{margin:0;padding:0}html,code{font:15px/22px arial,sans-serif}html{background:#fff;color:#222;padding:15px}body{margin:7% auto 0;max-width:390px;min-height:180px;padding:30px 0 15px}* > body{background:url(//www.google.com/images/errors/robot.png) 10
debug1: ssh_exchange_identification: 0% 5px no-repeat;padding-right:205px}p{margin:11px 0 22px;overflow:hidden}ins{color:#777;text-decoration:none}a img{border:0}@media screen and (max-width:772px){body{background:none;margin-top:0;max-width:none;padding-right:0}}#logo{background:url(//www.g
debug1: ssh_exchange_identification: oogle.com/images/branding/googlelogo/1x/googlelogo_color_150x54dp.png) no-repeat;margin-left:-5px}@media only screen and (min-resolution:192dpi){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) no-repeat 0
debug1: ssh_exchange_identification: % 0%/100% 100%;-moz-border-image:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_color_150x54dp.png) 0}}@media only screen and (-webkit-min-device-pixel-ratio:2){#logo{background:url(//www.google.com/images/branding/googlelogo/2x/googlelogo_
debug1: ssh_exchange_identification: color_150x54dp.png) no-repeat;-webkit-background-size:100% 100%}}#logo{display:inline-block;height:54px;width:150px}

debug1: ssh_exchange_identification:   </style>

debug1: ssh_exchange_identification:   <a href=//www.google.com/><span id=logo aria-label=Google></span></a>

debug1: ssh_exchange_identification:   <p><b>400.</b> <ins>That\342\200\231s an error.</ins>

debug1: ssh_exchange_identification:   <p>Your client has issued a malformed or illegal request.  <ins>That\342\200\231s all we know.</ins>

ssh_exchange_identification: Connection closed by remote host

 

Here is another example on how to test remote host whether a certain service such as DNS (bind) or telnetd is enabled and listening on remote local network  IP with ssh

[root@pciserver: /home ]# ssh 192.168.1.200 -p 53 -v -oConnectTimeout=5
OpenSSH_7.9p1 Debian-10+deb10u2, OpenSSL 1.1.1g  21 Apr 2020
debug1: Connecting to 192.168.1.200 [192.168.1.200] port 53.
debug1: connect to address 192.168.1.200 port 53: Connection timed out
ssh: connect to host 192.168.1.200 port 53: Connection timed out

[root@server: /home ]# ssh 192.168.1.200 -p 23 -v -oConnectTimeout=5
OpenSSH_7.9p1 Debian-10+deb10u2, OpenSSL 1.1.1g  21 Apr 2020
debug1: Connecting to 192.168.1.200 [192.168.1.200] port 23.
debug1: connect to address 192.168.1.200 port 23: Connection timed out
ssh: connect to host 192.168.1.200 port 23: Connection timed out


But what if Linux server you have tow work on is so paranoid that you even the ssh client is absent? Well you can use anything else that is capable of doing a connectivity to remote port such as wget or curl. Some web servers or application servers usually have wget or curl as it is integral part for some local shell scripts doing various operation needed for proper services functioning or simply to test locally a local or remote listener services, if that's the case we can use curl to connect and get output of a remote service simulating a normal telnet connection like this:

host:~# curl -vv 'telnet://remote-server-host5:22'
* About to connect() to remote-server-host5 port 22 (#0)
*   Trying 10.52.67.21… connected
* Connected to aflpvz625 (10.52.67.21) port 22 (#0)
SSH-2.0-OpenSSH_5.3

Now lets test whether we can connect remotely to a local net remote IP's Qmail mail server with curls telnet simulation mode:

host:~#  curl -vv 'telnet://192.168.0.200:25'
* Expire in 0 ms for 6 (transfer 0x56066e5ab900)
*   Trying 192.168.0.200…
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x56066e5ab900)
* Connected to 192.168.0.200 (192.168.0.200) port 25 (#0)
220 This is Mail Pc-Freak.NET ESMTP

Fine it works, lets now test whether a remote server who has MySQL listener service on standard MySQL port TCP 3306 is reachable with curl

host:~#  curl -vv 'telnet://192.168.0.200:3306'
* Expire in 0 ms for 6 (transfer 0x5601fafae900)
*   Trying 192.168.0.200…
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x5601fafae900)
* Connected to 192.168.0.200 (192.168.0.200) port 3306 (#0)
Warning: Binary output can mess up your terminal. Use "–output -" to tell
Warning: curl to output it to your terminal anyway, or consider "–output
Warning: <FILE>" to save to a file.
* Failed writing body (0 != 107)
* Closing connection 0
root@pcfreak:/var/www/images#  curl -vv 'telnet://192.168.0.200:3306'
* Expire in 0 ms for 6 (transfer 0x5598ad008900)
*   Trying 192.168.0.200…
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x5598ad008900)
* Connected to 192.168.0.200 (192.168.0.200) port 3306 (#0)
Warning: Binary output can mess up your terminal. Use "–output -" to tell
Warning: curl to output it to your terminal anyway, or consider "–output
Warning: <FILE>" to save to a file.
* Failed writing body (0 != 107)
* Closing connection 0

As you can see the remote connection is returning binary data which is unknown to a standard telnet terminal thus to get the output received we need to pass curl suggested arguments.

host:~#  curl -vv 'telnet://192.168.0.200:3306' –output –
* Expire in 0 ms for 6 (transfer 0x55b205c02900)
*   Trying 192.168.0.200…
* TCP_NODELAY set
* Expire in 200 ms for 4 (transfer 0x55b205c02900)
* Connected to 192.168.0.200 (192.168.0.200) port 3306 (#0)
g


The curl trick used to troubleshoot remote port to remote host from a Windows OS host which does not have telnet installed by default but have curl instead.

Also When troubleshooting vSphere Replication, it is often necessary to troubleshoot port connectivity as common Windows utilities are not available.
As Curl is available in the VMware vCenter Server Appliance command line interface.

On servers where curl is not there but you have wget is installed you can use it also to test a remote port

 

# wget -vv -O /dev/null http://google.com:554 –timeout=5
–2020-12-30 16:54:22–  http://google.com:554/
Resolving google.com (google.com)… 172.217.169.206, 2a00:1450:4017:80b::200e
Connecting to google.com (google.com)|172.217.169.206|:554… failed: Connection timed out.
Connecting to google.com (google.com)|2a00:1450:4017:80b::200e|:554… failed: Cannot assign requested address.
Retrying.

–2020-12-30 16:54:28–  (try: 2)  http://google.com:554/
Connecting to google.com (google.com)|172.217.169.206|:554… ^C

As evident from output the port 554 is filtered in google which is pretty normal.

If curl or wget is not there either as a final alternative you can either install some perl, ruby, python or bash script etc. that can opens a remote socket to the remote IP.

How to check Host is up with Nagios for servers with disabled ICMP (ping) protocol

Friday, July 15th, 2011

At the company where I administrate some servers, they’re running Nagios to keep track of the servers status and instantly report if problems with connectivity to certain servers occurs.

Now one of the servers which had configured UP host checks is up, but because of heavy ICMP denial of service attacks to the servers the ICMP protocol ping is completely disabled.

In Nagios this host was constantly showing as DOWN in the usual red color, so nagios reported issue even though all services on the client are running fine.

As this is quite annoying, I checked if Nagios supports host checking without doing the ICMP ping test. It appeared it does through something called in nagios Submit passive check result for host

Enabling the “Submit passive check result for this host” could be done straight from Nagios’s web interface (so I don’t even have to edit configurations! ;).
Here is how I did it. In Nagios I had to navigate to:

Hosts -> Click over my host (hosting1) which showed in red as down

Nagios disable ICMP ping report for hosts

You see my down host which I clicked over showing in red in above pic.

On next Nagios screen I had to select, Disable active checks of this host

Nagios Disable active ICMP checks of this host
and press on the Commit button.

Next following text appears on browser:

Your command request was successfully submitted to Nagios for processing.

Note: It may take a while before the command is actually processed.

Afterwards I had to click on Submit passive check result for this host and in:
Check Output to type in:

check_tcp -p 80

Here is the Screenshot of the Command Options dialog:

Nagios submit passive check with check TCP -p 80

That’s all now Nagious should start checking the down host by doing a query if the webserver on port 80 is up and running instead of pinging it.
As well as the server is no longer shown in the Nagio’s Down host list.