Posts Tagged ‘script’

How to Install and Use Netdata on Debian / Ubuntu Linux for Real-Time Server Monitoring

Wednesday, October 29th, 2025

netdata-server-monitoring-simple-tool-for-linux-bsd-windows

Monitoring system performance is one of the most overlooked aspects of server administration – until something breaks. That is absolutely true for beginners and home brew labs for learnings on how to manage servers and do basic system administration. For novice monitoring servers and infrastructure is not a big deal as usually testing machines are OK to break things. But still if you happen to host yourself some own brew websites a blog or Video streaming servers a torrent server, a tor server, some kind of public proxy or whatever of community advantageous solution for free at home, then monitoring will become important topic at some time when your small thing becomes big.

 While many sysadmins reach for Prometheus or Zabbix, using those comes with much of a complexity and effort to put in installing + setting up the overall server and agent clients as well as configure different monitoring checks.

Anyways sometimes you just need a lightweight, zero-configuration server monitoring tool that gives you instant visibility and you don’t want to learn much to have the basic and a bit of heuristic monitoring in the house.

That’s where Netdata shines.

What is Netdata?

Netdata is an open-source, real-time performance monitoring tool that visualizes CPU, memory, disk, network, and process metrics directly in your browser. It’s written in C and optimized for minimal overhead, making it perfect for both old hardware and modern VPS setups.

Netdata is built to primarly monitor different Linux OS distributions such as CentOS / RHEL / AlmaLinux / Rocky Linux / OpenSuse / Arch Linux / Amazon Linux and Oracle Linux it also has support for Windows OS as well as partially supports Mac OS and FreeBSD / OpenBSD / Dragonfly BSD and some other BSDs but some of its Linux features are not available on those.

1. Install Netdata real time performance monitoring tool


root@pcfreak:~# apt-cache show netdata|grep -i description -A3

Description-en: real-time performance monitoring (metapackage)

 Netdata is distributed, real-time, performance and health monitoring for

 systems and applications. It provides insights of everything happening on the

 systems it runs using interactive web dashboards.

Description-md5: 6843dd310958e94a27dd618821504b8e

Homepage: https://github.com/netdata/netdata

Section: net

Priority: optional

Netdata runs on most Linux distributions. On Debian or Ubuntu, installation is dead simple:

# apt update

# apt install curl -y


To be with the latest version since the default repositories provide usually older release you can use for install directly the kickstart installer from netdata.

# bash <(curl -Ss https://my-netdata.io/kickstart.sh)

This script automatically installs dependencies, sets up the Netdata daemon, and enables it as a systemd service.

Once installed, Netdata starts immediately. You can verify with:

# systemctl status netdata

2. Access the Web Dashboard


Open your browser and navigate to:

http://your-server-ip:19999/

You’ll be greeted by an intuitive dashboard showing real-time graphs of every aspect of your system — CPU load, memory usage, disk I/O, network bandwidth, and more.

If you’re running this on a public server, make sure to secure access with a firewall or reverse proxy, since by default Netdata is open to all IPs.

Example (UFW):

# ufw allow from YOUR.IP.ADDRESS.HERE to any port 19999

# ufw enable

3. Enable Persistent Storage

By default, Netdata stores only live data in memory. To retain historical data:
 

# vim /etc/netdata/netdata.conf

Find the [global] section and modify:

[global]

  history = 86400

  update every = 1

This keeps 24 hours of data at one-second resolution. You can also connect Netdata to a database backend for long-term archiving.

4. Secure with Basic Auth (Optional)

If you want simple password protection:

# apt install apache2-utils

# htpasswd -c /etc/netdata/.htpasswd admin

Then edit /etc/netdata/netdata.conf to include:

[web]

  mode = static-threaded

  default port = 19999

  allow connections from = localhost 192.168.1.*

  basic auth users file = /etc/netdata/.htpasswd

Restart Netdata:

# systemctl restart netdata

Why Netdata is Awesome 

Unlike Prometheus or Grafana, Netdata gives you instant insights without heavy setup. It’s ideal for:

  • Debugging high load or memory leaks in real time
  • Monitoring multiple VPS or embedded devices
  • Visualizing system resource usage with minimal CPU cost

And because it’s written in C, it’s insanely fast — often using less than 1% CPU on idle systems.

Final Thoughts

If you’re running any Linux server – whether for personal projects, web hosting, or experiments – Netdata is one of the easiest ways to visualize what’s happening under the hood.

You can even integrate it into your homelab or connect multiple nodes to the Netdata Cloud for centralized monitoring. And of course the full featured use and all the features of the tool are available just in the cloud version but that is absolutely normal. As the developers of netdata seems to have adopted the usual business model of providing things for free and same time selling another great things to make cash.

Netdata is really cool solution without cloud for people who needs to be able to quickly monitor like 20 or 50 servers without putting too much effort. You can simply install it across the machines and you will get a plenty of monitoring, just open each of the machines inside a separate browser tabs and take a look at what is going on once or 2 times a day. It is very old fashioned way to do monitoring but still can make sense if you don't want to bother too much with developing the monitoring or put any effort in it but still have a kind of obeservability on your mid size computer infrastructure.

So, next time your VPS feels sluggish or your load average spikes, fire up Netdata — your CPU will tell you exactly what’s wrong.

How to prevent /etc/resolv.conf to overwrite on every Linux boot. Make /etc/resolv.conf DNS records permanent forever

Tuesday, February 4th, 2025

how-to-make-prevent-etc-resolv.conf-to-ovewrite-on-every-linux-boot-make-etc-resolv-conf-permanent-forever

Have you recently been seriously bothered, after one of the updates from older to newer Debian / Ubuntu / CentOS or other Linux distributions by the fact /etc/resolv.conf has become a dynamic file that pretty much in the spirit of cloud technologies is being regenerated and ovewritten on each and every system (server) OS update /  reboot and due to that you start getting some wrong inappropriate DNS records /etc/resolv.conf causing you harm to the server infrastructure?

During my set of server infra i have faced that odditty for some years now and i guess every system administrator out there has suffered at a point by having to migrate an older Linux release to a newer one, where something gets messed up with DNS resolving due to that Linux OS new feature of /etc/resolv.conf not being really static any more.

The Dynamic resolv.conf file for glibc resolver is often generated used to be regenerated by resolvconf command and consequentially can be tampered by dhcpd resolved systemd service as well perhaps other mechanism depending on how the different Linux distribution architects make it to behave …

There are more than one ways to stop the annoying /etc/resolv.conf ovewritten behavior

1. Using dhcpd to stop /etc/resolv.conf being overwritten

Using dhcpd either a small null up script can be used or a separate hook script.

The null script would look like this

root@pcfreak:/root# vim /etc/dhcp/dhclient-enter-hooks.d/nodnsupdate

#!/bin/sh
make_resolv_conf() {
    :
}

root@pcfreak:/root# chmod +x /etc/dhcp/dhclient-enter-hooks.d/nodnsupdate

 

This script overrides an internal function called make_resolv_conf() that would normally overwrite resolv.conf and instead does nothing.

On old Ubuntu s and Debian versions this should work.


Alternative method is to use a small hook dhcp script like this:

root@pcfreak:/root# echo 'make_resolv_conf() { :; }' > /etc/dhcp/dhclient-enter-hooks.d/leave_my_resolv_conf_alone
chmod 755 /etc/dhcp/dhclient-enter-hooks.d/leave_my_resolv_conf_alone


Next boot when dhclient runs onreboot or when you manually run sudo ifdown -a ; sudo ifup -a , 
it loads this script nodnsupdate or the hook script and hopefully your manually configured values of /etc/resolv.conf would not mess up your file anymore.

2. Use a chattr and set immutable flag attribute to /etc/resolv.conf to prevent re-boot to ovewrite it

Anyways the universal and simple way "hack" to prevent /etc/resolv.conf many prefer to use instead of dhcp (especially as not everyone is running a dhcp on a server) , to overwrite is to delete the file and make it immutable with chattr (assuming chattr is supported by the filesystem i.e. EXT3 / EXT4 / XFS , you use on the Linux.).

You might need to check the filesystem type, before using chattr.

root@pcfreak:/root# blkid  | awk '{print $1 ,$3, $4}'
/dev/xvda1: TYPE="xfs"
/dev/xvda2: TYPE="LVM2_member"
/dev/mapper/centos-root: TYPE="xfs"
/dev/mapper/centos-swap: TYPE="swap"
/dev/loop0:
/dev/loop1:
/dev/loop2:

 

Normally EXT fs and XFS support it, note that this is not going to be the case with a network filesystem like NFS.

If you have some weird Filesystem type and you try to chattr you will get error like:

chattr: Inappropriate ioctl for device while reading flags on /etc/resolv.conf

To make /etc/resolv.conf file unchangeable on next boot by dhcpd or systemd-resolved

 a systemd service that provides network name resolution to local applications via a D-Bus interface, the resolve NSS service (nss-resolve)
 

root@pcfreak:/root# rm -f /etc/resolv.conf  
{ echo "nameserver 1.1.1.1";
echo "nameserver 1.0.0.1;
echo "search mydomain.com"; } >  /etc/resolv.conf
chattr +i  /etc/resolv.conf
reboot  


Also it is a good think if you don't plan after some update to have unexpected results caused by systemd-resolved doing something strange is to rename to /etc/systemd/resolved.conf.dpkg-bak or completely remove file

/etc/systemd/resolved.conf

To prevent dhcpd to overwrite the server /etc/resolv.conf from something automatically taken from preconfigured central DNS inside the network configurations made from /etc/network/interfaces configurations such as:

        dns-nameservers 127.0.0.1 8.8.8.8 8.8.4.4 207.67.222.222 208.67.220.220


You need to change the DHCP configuration file named dhclient.conf and use the supersede option. 
To so Edit /etc/dhcp/dhclient.conf.

Look for lines like these:

#supersede domain-name "fugue.com home.vix.com";
#prepend domain-name-servers 127.0.0.1;

Remove the preceding “#” comment and use the domain-name and/or domain-name-servers which you want (your DNS FQDN). Save and hopefully the DNS related ovewrite to /etc/resolv.conf would be stopped, e.g. changes inside /etc/resolv.conf mnually done should stay permanent.

Also it is a good practice to disable ddns-update-style direcive inside /etc/dhcp/dhcpd.conf

root@pcfreak:/root# vim /etc/dhcp/dhcpd.conf
##ddns-update-style none;

However on many newer Debian Linux as of 2025 and its .deb based derivative distros, you have to consider the /etc/resolv.conf is a symlink to another file /etc/resolvconf/run/resolv.conf

If that is the case with you then you'll have to set the immutable chattr attribute flag like so

root@pcfreak:~# chattr -V +i /etc/resolvconf/run/resolv.conf
chattr 1.47.0 (5-Feb-2023)
Flags of /etc/resolvconf/run/resolv.conf set as —-i—————–

root@pcfreak:/root# lsattr /etc/resolvconf/run/resolv.conf
—-i—————– /etc/resolvconf/run/resolv.conf

3.  Make /etc/resolv.conf permanent with simple custom a rc.local boot triggered resolv.conf ovewrite from a resolv.conf_actual template file

Consider that due to the increasing complexity of how Linux based OS-es behaves and the fact the Linux is more and more written to fit integration into the Cloud and be as easy as possible to containerize or orchestrate (with lets say docker or some cloud PODs) and other multitude of OS virtualiozation stuff modernities  /etc/resolv.conf might still continue to ovewrite ! 🙂

Thus I've come up with my very own unique and KISS (Keep it Simple Stupid) method to make sure /etc/resolv.conf is kept permanent and ovewritten on every boot for that "hack" trick you only need to have the good old /etc/rc.local enabled – i have written a short article how it can be enabled on newer debian / ubuntu / fedora / centos Linux here.

Prepare your permanent and static /etc/resolv.conf file containing your preferred server DNSes under a file /etc/resolv.conf_actual

Here is an example of one of my /etc/resolv.conf template files that gets ovweritten on each boot.

root@pcfreak:/root# cat /etc/resolv.conf_actual
domain pc-freak.net
search pc-freak.net
#nameserver 192.168.0.1

nameserver 127.0.0.1
nameserver 8.8.8.8
nameserver 8.8.4.4
nameserver 212.39.90.42
nameserver 212.39.90.43
nameserver 208.67.222.222
nameserver 208.67.220.220
options timeout:2 rotate


And in /etc/rc.local place before the exit directive inside the file simple copy over the original /etc/resolv.conf file real location.

Before proceeding to add it to execute /etc/rc.local assure yourself file is being venerated by OS.
 

root@pcfreak:/etc/dhcp# systemctl status rc-local
● rc-local.service – /etc/rc.local Compatibility
     Loaded: loaded (/etc/systemd/system/rc-local.service; enabled; preset: enabled)
    Drop-In: /usr/lib/systemd/system/rc-local.service.d
             └─debian.conf
     Active: active (exited) since Sun 2024-12-08 21:59:01 EET; 1 month 27 days ago
       Docs: man:systemd-rc-local-generator(8)
    Process: 1417 ExecStart=/etc/rc.local start (code=exited, status=0/SUCCESS)
        CPU: 302ms

Notice: journal has been rotated since unit was started, output may be incomplete.

root@pcfreak:/root# vim /etc/rc.local

 

cp -rpf /etc/resolv.conf_actual /etc/resolvconf/run/resolv.conf


NB ! Make sure those line is placed before any exit 0 command in /etc/rc.local otherwise that won''t work

That's it folks 🙂 
Using this simple trick you should be no longer bothered by a mysterious /etc/resolv.conf overwritten on next server reboot or system update (via a puppet / ansible or some other centralized update automation stuff) causing you a service or infrastructure outage.

Enjoy !

How to run multiple processes in parallel with xargs

Wednesday, May 29th, 2024

In our company there is a legacy application which developers run in multiple consoles launching its in intedependent components together in different consoles by simply running each component, the question comes then how this can be scripted so no waste of people is done to manually run the different componets from different parallel consoles. To achive the run in parallel of the multiple programs in parallel and background it with xargs and eval (integrated bash command) in Linux from a single script you can use a simple one liner like in the example.

#run in parallel
xargs -P <n> allows you to run <n> commands in parallel.
time xargs -P 3 -I {} sh -c 'eval "$1"' – {} <<'EOF'
program1; sleep 1; echo 1
program2 ; sleep 2; echo 2
program3; sleep 3; echo 3
echo 4
EOF

You can attune the delay up to the exact requirements or completely remove it and the multi run script is ready, enjoy.

DNS Monitoring: Check and Alert if DNS nameserver resolver of Linux machine is not properly resolving shell script. Monitor if /etc/resolv.conf DNS runs Okay

Thursday, March 14th, 2024

linux-monitor-check-dns-is-resolving-fine

If you happen to have issues occasionally with DNS resolvers and you want to keep up an eye on it and alert if DNS is not properly resolving Domains, because sometimes you seem to have issues due to network disconnects, disturbances (modifications), whatever and you want to have another mean to see whether a DNS was reachable or unreachable for a time, here is a little bash shell script that does the "trick".

Script work mechacnism is pretty straight forward as you can see we check what are the configured nameservers if they properly resolve and if they're properly resolving we write to log everything is okay, otherwise we write to the log DNS is not properly resolvable and send an ALERT email to preconfigured Email address.

Below is the check_dns_resolver.sh script:

 

#!/bin/bash
# Simple script to Monitor DNS set resolvers hosts for availability and trigger alarm  via preset email if any of the nameservers on the host cannot resolve
# Use a configured RESOLVE_HOST to try to resolve it via available configured nameservers in /etc/resolv.conf
# if machines are not reachable send notification email to a preconfigured email
# script returns OK 1 if working correctly or 0 if there is issue with resolving $RESOLVE_HOST on $SELF_HOSTNAME and mail on $ALERT_EMAIL
# output of script is to be kept inside DNS_status.log

ALERT_EMAIL='your.email.address@email-fqdn.com';
log=/var/log/dns_status.log;
TIMEOUT=3; DNS=($(grep -R nameserver /etc/resolv.conf | cut -d ' ' -f2));  

SELF_HOSTNAME=$(hostname –fqdn);
RESOLVE_HOST=$(hostname –fqdn);

for i in ${DNS[@]}; do dns_status=$(timeout $TIMEOUT nslookup $RESOLVE_HOST  $i); 

if [[ “$?” == ‘0’ ]]; then echo "$(date "+%y.%m.%d %T") $RESOLVE_HOST $i on host $SELF_HOST OK 1" | tee -a $log; 
else 
echo "$(date "+%y.%m.%d %T")$RESOLVE_HOST $i on host $SELF_HOST NOT_OK 0" | tee -a $log; 

echo "$(date "+%y.%m.%d %T") $RESOLVE_HOST $i DNS on host $SELF_HOST resolve ERROR" | mail -s "$RESOLVE_HOST /etc/resolv.conf $i DNS on host $SELF_HOST resolve ERROR";

fi

 done

Download check_dns_resolver.sh here set the script to run via a cron job every lets say 5 minutes, for example you can set a cronjob like this:
 

# crontab -u root -e
*/5 * * * *  check_dns_resolver.sh 2>&1 >/dev/null

 

Then Voila, check the log /var/log/dns_status.log if you happen to run inside a service downtime and check its output with the rest of infrastructure componets, network switch equipment, other connected services etc, that should keep you in-line to proof during eventual RCA (Root Cause Analysis) if complete high availability system gets down to proof your managed Linux servers was not the reason for the occuring service unavailability.

A simplified variant of the check_dns_resolver.sh can be easily integrated to do Monitoring with Zabbix userparameter script and DNS Check Template containing few Triggers, Items and Action if I have time some time in the future perhaps, I'll blog a short article on how to configure such DNS zabbix monitoring, the script zabbix variant of the DNS monitor script is like this:

[root@linux-server bin]# cat check_dns_resolver.sh 
#!/bin/bash
TIMEOUT=3; DNS=($(grep -R nameserver /etc/resolv.conf | cut -d ' ' -f2));  for i in ${DNS[@]}; do dns_status=$(timeout $TIMEOUT nslookup $(hostname –fqdn) $i); if [[ “$?” == ‘0’ ]]; then echo "$i OK 1"; else echo "$i NOT OK 0"; fi; done

[root@linux-server bin]#


Hope this article, will help someone to improve his Unix server Infrastucture monitoring.

Enjoy and Cheers !

How to count number of ESTABLISHED state TCP connections to a Windows server

Wednesday, March 13th, 2024

count-netstat-established-connections-on-windows-server-howto-windows-logo-debug-network-issues-windows

Even if you have the background of a Linux system administrator, sooner or later you will have have to deal with some Windows hosts, thus i'll blog in this article shortly on how the established TCP if it happens you will have to administarte a Windows hosts or help a windows sysadmin noobie 🙂

In Linux it is pretty easy to check the number of established conenctions, because of the wonderful command wc (word count). with a simple command like:
 

$ netstat -etna |wc -l


Then you will get the number of active TCP connections to the machine and based on that you can get an idea on how busy the server is.

But what if you have to deal with lets say a Microsoft Windows 2012 /2019 / 2020 or 2022 Server, assuming you logged in as Administrator and you see the machine is quite loaded and runs multiple Native Windows Administrator common services such as IIS / Active directory Failover Clustering, Proxy server etc.
How can you identify the established number of connections via a simple command in cmd.exe?

1.Count ESTABLISHED TCP connections from Windows Command Line

Here is the answer, simply use netstat native windows command and combine it with find, like that and use the /i (ignores the case of characters when searching the string) /c (count lines containing the string) options

C:\Windows\system32>netstat -p TCP -n|  find /i "ESTABLISHED" /c
1268

Voila, here are number of established connections, only 1268 that is relatively low.
However if you manage Windows servers, and you get some kind of hang ups as part of the monitoring, it is a good idea to setup a script based on this simple command for at least Windows Task Scheduler (the equivallent of Linux's crond service) to log for Peaks in Established connections to see whether Server crashes are not related to High Rise in established connections.
Even better if company uses Zabbix / Nagios, OpenNMS or other  old legacy monitoring stuff like Joschyd even as of today 2024 used in some big of the TOP IT companies such as SAP (they were still using it about 4 years ago for their SAP HANA Cloud), you can set the script to run and do a Monitoring template or Alerting rules to draw you graphs and Trigger Alerts if your connections hits a peak, then you at least might know your Windows server is under a "Hackers" Denial of Service attack or there is something happening on the network, like Cisco Network Infrastructure Switch flappings or whatever.

Perhaps an example script you can use if you decide to implement the little nestat established connection checks Monitoring in Zabbix is the one i've writen about in the previous article "Calculate established connection from IP address with shell script and log to zabbix graphic".

2. Few Useful netstat options for the Windows system admin
 

C:\Windows\System32> netstat -bona


netstat-useful-arguments-for-the-windows-system-administrator

Cmd.exe will lists executable files, local and external IP addresses and ports, and the state in list form. You immediately see which programs have created connections or are listening so that you can find offenders quickly.

b – displays the executable involved in  creating the connection.
o – displays the owning process ID.
n – displays address and port numbers.
a – displays all connections and listening ports.

As you can see in the screenshot, by using netstat -bona you get which process has binded to which local address and the Process ID PID of it, that is pretty useful in debugging stuff.

3. Use a Third Party GUI tool to debug more interactively connection issues

If you need to keep an eye in interactive mode, sometimes if there are issues CurrPorts tool can be of a great help

currports-windows-network-connections-diagnosis-cports

CurrPorts Tool own Description

CurrPorts is network monitoring software that displays the list of all currently opened TCP/IP and UDP ports on your local computer. For each port in the list, information about the process that opened the port is also displayed, including the process name, full path of the process, version information of the process (product name, file description, and so on), the time that the process was created, and the user that created it.
In addition, CurrPorts allows you to close unwanted TCP connections, kill the process that opened the ports, and save the TCP/UDP ports information to HTML file , XML file, or to tab-delimited text file.
CurrPorts also automatically mark with pink color suspicious TCP/UDP ports owned by unidentified applications (Applications without version information and icons).

Sum it up

What we learned is how to calculate number of established TCP connections from command line, useful for scripting, how you can use netstat to display the process ID and Process name that relates to a used Local / Remote TCP connections, and how eventually you can use this to connect it to some monitoring tool to periodically report High Peaks with TCP established connections (usually an indicator of servere system issues).
 

How to monitor Postfix Mail server work correct with simple one liner Zabbix user parameter script / Simple way to capture and report SMTP machine issues Zabbix template

Thursday, June 22nd, 2023

setup-zabbix-smtp-mail-monitoring-postfix-qmail-exim-with-easy-userparameter-script-and-template-zabbix-logo

In this article, I'm going to show you how to setup a very simple monitoring if a local running SMTP (Postfix / Qmail / Exim) is responding correctly on basic commands. The check would helpfully keep you in track to know whether your configured Linux server local MTA (Mail Transport Agent) is responding on requests on TCP / IP protocol Port 25, as well as a check for process existence of master (that is the main postfix) proccess, as well as the usual postfix spawned sub-processes qmgr (the postfix queue manager), tsl mgr (TLS session cache and PRNG manager), pickup (Postfix local mail pickup) – or email receiving process.

 

Normally a properly configured postfix installation on a Linux whatever you like distribution would look something like below:

#  ps -ef|grep -Ei 'master|postfix'|grep -v grep
root        1959       1  0 Jun21 ?        00:00:00 /usr/libexec/postfix/master -w
postfix     1961    1959  0 Jun21 ?        00:00:00 qmgr -l -t unix -u
postfix     4542    1959  0 Jun21 ?        00:00:00 tlsmgr -l -t unix -u
postfix  2910288    1959  0 11:28 ?        00:00:00 pickup -l -t unix -u

At times, during mail server restarts the amount of processes that are sub spawned by postfix, may very and if you a do a postfix restart

# systemctl restart postfix

The amout of spawned processes running as postfix username might decrease, and only qmgr might be available for second thus in the consequential shown Template the zabbix processes check to make sure the Postfix is properly operational on the Linux machine is made to check for the absolute minumum of 

1. master (postfix process) that runs with uid root
2. and one (postfix) username binded proccess 

If the amount of processes on the host is less than this minimum number and the netcat is unable to simulate a "half-mail" sent, the configured Postfix alarm Action (media and Email) will take place, and you will get immediately notified, that the monitored Mail server has issue!

The idea is to use a small one liner connection with netcat and half simulate a normal SMTP transaction just like you would normally do:

 

root@pcfrxen:/root # telnet localhost 25
Trying 127.0.0.1…
Connected to localhost.
Escape character is '^]'.
220 This is Mail2 Pc-Freak.NET ESMTP
HELO localhost
250 This is Mail2 Pc-Freak.NET
MAIL FROM:<hipopo@pc-freak.net>
250 ok
RCPT TO:<hip0d@remote-smtp-server.com>

 

and then disconnect the connection.

1. Create new zabbix userparameter_smtp_check.conf file

The simple userparameter one liner script to do the task looks like this:

# vi /etc/zabbix/zabbix_agent.d/userparameter_smtp_check.conf

UserParameter=smtp.check,(if [[ $(echo -e “HELO localhost\n MAIL FROM: root@$HOSTNAME\n RCPT TO: report-email@your-desired-mail-server.com\n  QUIT\n” | /usr/bin/nc localhost 25 -w 5 2>&1 | grep -Ei ‘220\s.*\sESMTP\sPostfix|250\s\.*|250\s\.*\sOk|250\s\.*\sOk|221\.*\s\w’|wc -l) == ‘5’ ]]; then echo "SMTP OK 1"; else echo "SMTP NOK 0"; fi)

Set the proper permissions so either file is owned by zabbix:zabbix or it is been able to be read from all system users.
 

# chmod a+r /etc/zabbix/zabbix_agent.d/userparameter_smtp_check.conf

2. Create a new Template for the Mail server monitoring
 


 

Just like any other template name it with what fits you as you see, I've call it PROD SMTP Monitoring, as the template is prepared to specifically monitor In Production Linux machines, and a separate template is used to monitor the Quality Assurance (QAs) as well as PreProd (Pre Productions).

3. Create the followng Items and Depedent Item to process zabbix-agent received data from the Userparam script
 

Above is the list of basic Items and Dependent Item you will need to configure inside the SMTP Check zabbix Template.

The Items should have the following content and configurations:
 

/postfix-main-proc-service-item-zabbix-shot


*Name: postfix_main_proc.service
Type: Zabbix agent(active)
*Key: proc.num[master,root]
Type of Information: Numeric (unassigned)
*Update interval: 30s
Custom Intervals: Flexible
*History storage period: 90d
*Trend storage period: 365d
Show Value: as is
Applications: Postfix Checks
Populated host inventory field: -None-
Description: The item counts master daemon process that runs Postfix daemons on demand

Where the arguments pased to proc.num[] function are:
  master is the process that is being looked up for and root is the username with which the the postfix master daemon is running. If you need to adapt it for qmail or exim that shouldn't be a big deal you only have to in advance check the exact processes that are normally running on the machine
and configure a similar process check for it.

*Name: postfix_sub_procs.service_cnt
Type: Zabbix agent(active)
*Key: proc.num[,postfix]
Type of information: Numeric (unassigned)
Update Interval: 30s
*History Storage period: Storage Period 90d
*Trend storage period: Storage Period 365d
Description: The item counts master daemon processes that runs postfix daemons on demand.

Here the idea with this Item is to check the number of processes that are running with user / groupid that is postfix. Again for other SMPT different from postfix, just set it to whatever user / group 
you would like zabbix to look up for in Linux the process list. As you can see here the check for existing postfix mta process is done every 30 seconds (for more critical environments you can put it to less).

For simple zabbix use this Dependent Item is not necessery required. But as we would like to process more closely the output of the userparameter smtp script, you have to set it up.
If you want to write graphical representation by sending data to Grafana.

*Name: postfix availability check
Key: postfix_boolean_check[boolean]
Master Item: PROD SMTP Monitoring: postfix availability check
Type of Information: Numeric unassigned
*History storage period: Storage period 90d
*Trend storage period: 365d

Applications: Postfix Checks

Description: It returns boolean value of SMTP check
1 – True (SMTP is OK)
0 – False (SMTP does not responds)

Enabled: Tick

*Name: postfix availability check
*Key: smtp.check
Custom intervals: Flexible
*Update interval: 30 m
History sotrage period: Storage Period 90d
Applications: Postfix Checks
Populates host inventory field: -None-
Description: This check is testing if the SMTP relay is reachable, without actual sending an email
Enabled: Tick

4. Configure following Zabbix Triggers

 

Note: The severity levels you should have previosly set in Zabbix up to your desired ones.

Name: postfix master root process is not running
*Problem Expression: {PROD SMTP Monitoring:proc.num[master,root].last()}<1

OK event generation: Recovery expression
*Recovery Expression: {PROD SMTP Monitoring:proc.num[master,root].last()}>=1
Allow manual close: Tick

Description: The item counts master daemon process that runs Postfix daemon on demand.
Enabed: Tick

I would like to have an AUTO RESOLVE for any detected mail issues, if an issue gets resolved. That is useful especially if you don't have the time to put the Zabbix monitoring in Maintainance Mode during Operating system planned updates / system reboots or unexpected system reboots due to electricity power loss to the server colocated – Data Center / Rack . 


*Name: postfix master sub processes are not running
*Problem Expression: {P09 PROD SMTP Monitoring:proc.num[,postfix].last()}<1
PROBLEM event generation mode: Single
OK event closes: All problems

*Recovery Expression: {P09 PROD SMTP Monitoring:proc.num[,postfix].last()}>=1
Problem event generation mode: Single
OK event closes: All problems
Allow manual close: Tick
Enabled: Tick

Name: SMTP connectivity check
Severity: WARNING
*Expression: {PROD SMTP Monitoring:postfix_boolen_check[boolean].last()}=0
OK event generation: Expression
PROBLEM even generation mode: SIngle
OK event closes: All problems

Allow manual close: Tick
Enabled: Tick

5. Configure respective Zabbix Action

 

zabbix-configure-Actions-screenshotpng
 

As the service is tagged with 'pci service' tag we define the respective conditions and according to your preferences, add as many conditions as you need for the Zabbix Action to take place.

NOTE! :
Assuming that communication chain beween Zabbix Server -> Zabbix Proxy (if zabbix proxy is used) -> Zabbix Agent works correctly you should start receiving that from the userparameter script in Zabbix with the configured smtp.check userparam key every 30 minutes.

Note that this simple nc check will keep a trail records inside your /var/log/maillog for each netcat connection, so keep in mind that in /var/log/maillog on each host which has configured the SMTP Check zabbix template, you will have some records  similar to:

# tail -n 50 /var/log/maillog
2023-06-22T09:32:18.164128+02:00 lpgblu01f postfix/smtpd[2690485]: improper command pipelining after HELO from localhost[127.0.0.1]:  MAIL FROM: root@your-machine-fqdn-address.com\n RCPT TO: your-supposable-receive-addr@whatever-mail-address.com\n  QUIT\n
2023-06-22T09:32:18.208888+02:00 lpgblu01f postfix/smtpd[2690485]: 32EB02005B: client=localhost[127.0.0.1]
2023-06-22T09:32:18.209142+02:00 lpgblu01f postfix/smtpd[2690485]: disconnect from localhost[127.0.0.1] helo=1 mail=1 rcpt=1 quit=1 commands=4
2023-06-22T10:02:18.889440+02:00 lpgblu01f postfix/smtpd[2747269]: connect from localhost[127.0.0.1]
2023-06-22T10:02:18.889553+02:00 lpgblu01f postfix/smtpd[2747269]: improper command pipelining after HELO from localhost[127.0.0.1]:  MAIL FROM: root@your-machine-fqdn-address.com\n RCPT TO: your-supposable-receive-addr@whatever-mail-address.com\n  QUIT\n
2023-06-22T10:02:18.933933+02:00 lpgblu01f postfix/smtpd[2747269]: E3ED42005B: client=localhost[127.0.0.1]
2023-06-22T10:02:18.934227+02:00 lpgblu01f postfix/smtpd[2747269]: disconnect from localhost[127.0.0.1] helo=1 mail=1 rcpt=1 quit=1 commands=4
2023-06-22T10:32:26.143282+02:00 lpgblu01f postfix/smtpd[2804195]: connect from localhost[127.0.0.1]
2023-06-22T10:32:26.143439+02:00 lpgblu01f postfix/smtpd[2804195]: improper command pipelining after HELO from localhost[127.0.0.1]:  MAIL FROM: root@your-machine-fqdn-address.com\n RCPT TO: your-supposable-receive-addr@whatever-mail-address.com\n  QUIT\n
2023-06-22T10:32:26.186681+02:00 lpgblu01f postfix/smtpd[2804195]: 2D7F72005B: client=localhost[127.0.0.1]
2023-06-22T10:32:26.186958+02:00 lpgblu01f postfix/smtpd[2804195]: disconnect from localhost[127.0.0.1] helo=1 mail=1 rcpt=1 quit=1 commands=4
2023-06-22T11:02:26.924039+02:00 lpgblu01f postfix/smtpd[2860398]: connect from localhost[127.0.0.1]
2023-06-22T11:02:26.924160+02:00 lpgblu01f postfix/smtpd[2860398]: improper command pipelining after HELO from localhost[127.0.0.1]:  MAIL FROM: root@your-machine-fqdn-address.com\n RCPT TO: your-supposable-receive-addr@whatever-mail-address.com\n  QUIT\n
2023-06-22T11:02:26.963014+02:00 lpgblu01f postfix/smtpd[2860398]: EB08C2005B: client=localhost[127.0.0.1]
2023-06-22T11:02:26.963257+02:00 lpgblu01f postfix/smtpd[2860398]: disconnect from localhost[127.0.0.1] helo=1 mail=1 rcpt=1 quit=1 commands=4
2023-06-22T11:32:29.145553+02:00 lpgblu01f postfix/smtpd[2916905]: connect from localhost[127.0.0.1]
2023-06-22T11:32:29.145664+02:00 lpgblu01f postfix/smtpd[2916905]: improper command pipelining after HELO from localhost[127.0.0.1]:  MAIL FROM: root@your-machine-fqdn-address.com\n RCPT TO: your-supposable-receive-addr@whatever-mail-address.com\n  QUIT\n
2023-06-22T11:32:29.184539+02:00 lpgblu01f postfix/smtpd[2916905]: 2CF7D2005B: client=localhost[127.0.0.1]
2023-06-22T11:32:29.184729+02:00 lpgblu01f postfix/smtpd[2916905]: disconnect from localhost[127.0.0.1] helo=1 mail=1 rcpt=1 quit=1 commands=4

 

 

That's all folks use the :
Configuration -> Host (menu)

and assign the new SMTP check template to as many of the Linux hosts where you have setup the Userparameter script and Enjoy the new mail server monitoring at hand.

Zabbix: Monitor Linux rsyslog configured central log server is rechable with check_log_server_status.sh userparameter script

Wednesday, June 8th, 2022

zabbix-monitor-central-log-server-is-reachable-from-host-with-a-userparamater-script-zabbix-logo

On modern Linux OS servers on Redhat / CentOS / Fedora and Debian based distros log server service is usually running on the system  such as rsyslog (rsyslogd) to make sure the logging from services is properly logged in separate logs under /var/log.

A very common practice on critical server machines in terms of data security, where logs produced by rsyslog daermon needs to be copied over network via TCP or UDP protocol immediately is to copy over the /var/log produced logs to another configured central logging server. Then later every piece of bit generated by rsyslogd could be  overseen by a third party auditor person and useful for any investigation in case of logs integrity is required or at worse case if there is a suspicion that system in question is hacked by a malicious hax0r and logs have been "cleaned" up from any traces leading to the intruder (things usually done locally by hackers) or by any automated script exploit tools since yesr.

This doubled logging of system events to external log server  ipmentioned is very common practice by companies to protect their log data and quite useful for logs to be recovered easily later on from the central logging server machine that could be also setup for example to use rsyslogd to receive logs from other Linux machines in circumstances where some log disappears just like that (things i've seen happen) for any strange reason or gets destroyed by the admins mistake locally on machine / or by any other mean such as filesystem gets damaged. a very common practice by companies to protect their log data.  

Monitor remote logging server is reachable with userparameter script

Assuming that you already have setup a logging from the server hostname A towards the Central logging server log storepool and everything works as expected the next logical step is to have at least some basic way to monitor remote logging server configured is still reachable all the time and respectively rsyslog /var/log/*.* logs gets properly produced on remote side for example with something like a simple TCP remote server port check and reported in case of troubles in zabbix.

To solve that simple task for company where I'm employed, I've developed below check_log_server_status.sh:
 

#!/bin/bash
# @@ for TCP @ for UDP
# check_log_server_status.sh Script to check if configured TCP / UDP logging server in /etc/rsyslog.conf is rechable
# report to zabbix
DELIMITER='@@';
GREP_PORT='5145';
CONNECT_TIMEOUT=5;

PORT=$(grep -Ei "*.* $DELIMITER.*:$GREP_PORT" /etc/rsyslog.conf|awk -F : '{ print $2 }'|sort -rn |uniq);

#for i in $(grep -Ei "*.* $DELIMITER.*:$GREP_PORT" /etc/rsyslog.conf |grep -v '\#'|awk -F"$DELIMITER" '{ print $2 }' | awk -F ':' '{ print $1 }'|sort -rn); do
HOST=$(grep -Ei "*.* $DELIMITER.*:$GREP_PORT" /etc/rsyslog.conf |grep -v '\#'|awk -F"$DELIMITER" '{ print $2 }' | awk -F ':' '{ print $1 }'|sort -rn)

# echo $PORT

if [[ ! -z $PORT ]] && [[ ! -z $HOST ]]; then
SSH_RETURN=$(/bin/ssh $HOST -p $PORT -o ConnectTimeout=$CONNECT_TIMEOUT 2>&1);
else
echo "PROBLEM Port $GREP_PORT not defined in /etc/rsyslog.conf";
fi

##echo SSH_RETURN $SSH_RETURN;
#exit 1;
if [[ $(echo $SSH_RETURN |grep -i ‘Connection timed out during banner exchange’ | wc -l) -eq ‘1’ ]]; then
echo "rsyslogd $HOST:$PORT OK";
fi

if [[ $(echo $SSH_RETURN |grep -i ‘Connection refused’ | wc -l) -eq ‘1’ ]]; then
echo "rsyslogd $HOST:$PORT PROBLEM";
fi

#sleep 2;
#done


You can download a copy of the script check_log_server_status.sh here

Depending on the port the remote rsyslogd central logging server is using configure it in the script with respective port through the DELIMITER='@@', GREP_PORT='5145', CONNECT_TIMEOUT=5 values.

The delimiter is setup as usually in /etc/rsyslog.conf this the remote logging server for TCP IP is configured with @@ prefix to indicated TCP mode should be used.

Below is example from /etc/rsyslog.conf of how the rsyslogd server is configured:

[root@Server-hostA /root]# grep -i @@ /etc/rsyslogd.conf
# central remote Log server IP / port
*.* @@10.10.10.1:5145

To use the script on a machine, where you have a properly configured zabbix-agentd service host connected and reporting data to a zabbix-server monitoring server.

1. Set up the script under /usr/local/bin/check_log_server_status.sh

[root@Server-hostA /root ]# vim /usr/local/bin/check_log_server_status.sh

[root@Server-hostA /root ]# chmod +x /usr/local/bin/check_log_server_status.sh

2. Prepare userparameter_check_log_server.conf with log_server.check Item key

[root@Server-hostA zabbix_agentd.d]# cat userparameter_check_log_server.conf 
UserParameter=log_server.check, /usr/local/bin/check_log_server_status.sh

3. Set in Zabbix some Item such as on below screenshot

 

check-log-server-status-screenshot-linux-item-zabbix.png4. Create a Zabbix trigger 

check-log-server-status-trigger-logserver-is-unreachable-zabbix


The redded hided field in Expression field should be substituted with your actual hostname on which the monitor script will run.

Webserver farm behind Load Balancer Proxy or how to preserve incoming internet IP to local net IP Apache webservers by adding additional haproxy header with remoteip

Monday, April 18th, 2022

logo-haproxy-apache-remoteip-configure-and-check-to-have-logged-real-ip-address-inside-apache-forwarded-from-load-balancer

Having a Proxy server for Load Balancing is a common solutions to assure High Availability of Web Application service behind a proxy.
You can have for example 1 Apache HTTPD webservers serving traffic Actively on one Location (i.e. one city or Country) and 3 configured in the F5 LB or haproxy to silently keep up and wait for incoming connections as an (Active Failure) Backup solution

Lets say the Webservers usually are set to have local class C IPs as 192.168.0.XXX or 10.10.10.XXX and living in isolated DMZed well firewalled LAN network and Haproxy is configured to receive traffic via a Internet IP 109.104.212.13 address and send the traffic in mode tcp via a NATTed connection (e.g. due to the network address translation the source IP of the incoming connections from Intenet clients appears as the NATTed IP 192.168.1.50.

The result is that all incoming connections from haproxy -> webservers will be logged in Webservers /var/log/apache2/access.log wrongly as incoming from source IP: 192.168.1.50, meaning all the information on the source Internet Real IP gets lost.

load-balancer-high-availailibility-haproxy-apache
 

How to pass Real (Internet) Source IPs from Haproxy "mode tcp" to Local LAN Webservers  ?
 

Usually the normal way to work around this with Apache Reverse Proxies configured is to use HTTP_X_FORWARDED_FOR variable in haproxy when using HTTP traffic application that is proxied (.e.g haproxy.cfg has mode http configured), you have to add to listen listener_name directive or frontend Frontend_of_proxy

option forwardfor
option http-server-close

However unfortunately, IP Header preservation with X_FORWADED_FOR  HTTP-Header is not possible when haproxy is configured to forward traffic using mode tcp.

Thus when you're forced to use mode tcp to completely pass any traffic incoming to Haproxy from itself to End side, the solution is to
 

  • Use mod_remoteip infamous module that is part of standard Apache installs both on apache2 installed from (.deb) package  or httpd rpm (on redhats / centos).

 

1. Configure Haproxies to send received connects as send-proxy traffic

 

The idea is very simple all the received requests from outside clients to Haproxy are to be send via the haproxy to the webserver in a PROXY protocol string, this is done via send-proxy

             send-proxy  – send a PROXY protocol string

Rawly my current /etc/haproxy/haproxy.cfg looks like this:
 

global
        log /dev/log    local0
        log /dev/log    local1 notice
        chroot /var/lib/haproxy
        user haproxy
        group haproxy
        daemon
        maxconn 99999
        nbproc          1
        nbthread 2
        cpu-map         1 0
        cpu-map         2 1


defaults
        log     global
       mode    tcp


        timeout connect 5000
        timeout connect 30s
        timeout server 10s

    timeout queue 5s
    timeout tunnel 2m
    timeout client-fin 1s
    timeout server-fin 1s

                option forwardfor

    retries                 15

 

 

frontend http-in
                mode tcp

                option tcplog
        log global

                option logasap
                option forwardfor
                bind 109.104.212.130:80
    fullconn 20000
default_backend http-websrv
backend http-websrv
        balance source
                maxconn 3000

stick match src
    stick-table type ip size 200k expire 30m
        stick on src


        server ha1server-1 192.168.0.205:80 check send-proxy weight 254 backup
        server ha1server-2 192.168.1.15:80 check send-proxy weight 255
        server ha1server-3 192.168.2.30:80 check send-proxy weight 252 backup
        server ha1server-4 192.168.1.198:80 check send-proxy weight 253 backup
                server ha1server-5 192.168.0.1:80 maxconn 3000 check send-proxy weight 251 backup

 

 

frontend https-in
                mode tcp

                option tcplog
                log global

                option logasap
                option forwardfor
        maxconn 99999
           bind 109.104.212.130:443
        default_backend https-websrv
                backend https-websrv
        balance source
                maxconn 3000
        stick on src
    stick-table type ip size 200k expire 30m


                server ha1server-1 192.168.0.205:443 maxconn 8000 check send-proxy weight 254 backup
                server ha1server-2 192.168.1.15:443 maxconn 10000 check send-proxy weight 255
        server ha1server-3 192.168.2.30:443 maxconn 8000 check send-proxy weight 252 backup
        server ha1server-4 192.168.1.198:443 maxconn 10000 check send-proxy weight 253 backup
                server ha1server-5 192.168.0.1:443 maxconn 3000 check send-proxy weight 251 backup

listen stats
    mode http
    option httplog
    option http-server-close
    maxconn 10
    stats enable
    stats show-legends
    stats refresh 5s
    stats realm Haproxy\ Statistics
    stats admin if TRUE

 

After preparing your haproxy.cfg and reloading haproxy in /var/log/haproxy.log you should have the Real Source IPs logged in:
 

root@webserver:~# tail -n 10 /var/log/haproxy.log
Apr 15 22:47:34 pcfr_hware_local_ip haproxy[2914]: 159.223.65.16:58735 [15/Apr/2022:22:47:34.586] https-in https-websrv/ha1server-2 1/0/+0 +0 — 7/7/7/7/0 0/0
Apr 15 22:47:34 pcfr_hware_local_ip haproxy[2914]: 20.113.133.8:56405 [15/Apr/2022:22:47:34.744] https-in https-websrv/ha1server-2 1/0/+0 +0 — 7/7/7/7/0 0/0
Apr 15 22:47:35 pcfr_hware_local_ip haproxy[2914]: 54.36.148.248:15653 [15/Apr/2022:22:47:35.057] https-in https-websrv/ha1server-2 1/0/+0 +0 — 7/7/7/7/0 0/0
Apr 15 22:47:35 pcfr_hware_local_ip haproxy[2914]: 185.191.171.35:26564 [15/Apr/2022:22:47:35.071] https-in https-websrv/ha1server-2 1/0/+0 +0 — 8/8/8/8/0 0/0
Apr 15 22:47:35 pcfr_hware_local_ip haproxy[2914]: 213.183.53.58:42984 [15/Apr/2022:22:47:35.669] https-in https-websrv/ha1server-2 1/0/+0 +0 — 6/6/6/6/0 0/0
Apr 15 22:47:35 pcfr_hware_local_ip haproxy[2914]: 159.223.65.16:54006 [15/Apr/2022:22:47:35.703] https-in https-websrv/ha1server-2 1/0/+0 +0 — 7/7/7/7/0 0/0
Apr 15 22:47:36 pcfr_hware_local_ip haproxy[2914]: 192.241.113.203:30877 [15/Apr/2022:22:47:36.651] https-in https-websrv/ha1server-2 1/0/+0 +0 — 4/4/4/4/0 0/0
Apr 15 22:47:36 pcfr_hware_local_ip haproxy[2914]: 185.191.171.9:6776 [15/Apr/2022:22:47:36.683] https-in https-websrv/ha1server-2 1/0/+0 +0 — 5/5/5/5/0 0/0
Apr 15 22:47:36 pcfr_hware_local_ip haproxy[2914]: 159.223.65.16:64310 [15/Apr/2022:22:47:36.797] https-in https-websrv/ha1server-2 1/0/+0 +0 — 6/6/6/6/0 0/0
Apr 15 22:47:36 pcfr_hware_local_ip haproxy[2914]: 185.191.171.3:23364 [15/Apr/2022:22:47:36.834] https-in https-websrv/ha1server-2 1/1/+1 +0 — 7/7/7/7/0 0/0

 

2. Enable remoteip proxy protocol on Webservers

Login to each Apache HTTPD and to enable remoteip module run:
 

# a2enmod remoteip


On Debians, the command should produce a right symlink to mods-enabled/ directory
 

# ls -al /etc/apache2/mods-enabled/*remote*
lrwxrwxrwx 1 root root 31 Mar 30  2021 /etc/apache2/mods-enabled/remoteip.load -> ../mods-available/remoteip.load

 

3. Modify remoteip.conf file and allow IPs of haproxies or F5s

 

Configure RemoteIPTrustedProxy for every Source IP of haproxy to allow it to send X-Forwarded-For header to Apache,

Here are few examples, from my apache working config on Debian 11.2 (Bullseye):
 

webserver:~# cat remoteip.conf
RemoteIPHeader X-Forwarded-For
RemoteIPTrustedProxy 192.168.0.1
RemoteIPTrustedProxy 192.168.0.205
RemoteIPTrustedProxy 192.168.1.15
RemoteIPTrustedProxy 192.168.0.198
RemoteIPTrustedProxy 192.168.2.33
RemoteIPTrustedProxy 192.168.2.30
RemoteIPTrustedProxy 192.168.0.215
#RemoteIPTrustedProxy 51.89.232.41

On RedHat / Fedora other RPM based Linux distrubutions, you can do the same by including inside httpd.conf or virtualhost configuration something like:
 

<IfModule remoteip_module>
      RemoteIPHeader X-Forwarded-For
      RemoteIPInternalProxy 192.168.0.0/16
      RemoteIPTrustedProxy 192.168.0.215/32
</IfModule>


4. Enable RemoteIP Proxy Protocol in apache2.conf / httpd.conf or Virtualhost custom config
 

Modify both haproxy / haproxies config as well as enable the RemoteIP module on Apache webservers (VirtualHosts if such used) and either in <VirtualHost> block or in main http config include:

RemoteIPProxyProtocol On


5. Change default configured Apache LogFormat

In Domain Vhost or apache2.conf / httpd.conf

Default logging Format will be something like:
 

LogFormat "%h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined


or
 

LogFormat "%v:%p %h %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined

 

Once you find it in /etc/apache2/apache2.conf / httpd.conf or Vhost, you have to comment out this by adding shebang infont of sentence make it look as follows:
 

LogFormat "%v:%p %a %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" vhost_combined
LogFormat "%a %l %u %t \"%r\" %>s %O \"%{Referer}i\" \"%{User-Agent}i\"" combined
LogFormat "%a %l %u %t \"%r\" %>s %O" common
LogFormat "%{Referer}i -> %U" referer
LogFormat "%{User-agent}i" agent


The Changed LogFormat instructs Apache to log the client IP as recorded by mod_remoteip (%a) rather than hostname (%h). For a full explanation of all the options check the official HTTP Server documentation page apache_mod_config on Custom Log Formats.

and reload each Apache server.

on Debian:

# apache2ctl -k reload

On CentOS

# systemctl restart httpd


6. Check proxy protocol is properly enabled on Apaches

 

remoteip module will enable Apache to expect a proxy connect header passed to it otherwise it will respond with Bad Request, because it will detect a plain HTML request instead of Proxy Protocol CONNECT, here is the usual telnet test to fetch the index.htm page.

root@webserver:~# telnet localhost 80
Trying 127.0.0.1…
Connected to localhost.
Escape character is '^]'.
GET / HTTP/1.1

HTTP/1.1 400 Bad Request
Date: Fri, 15 Apr 2022 19:04:51 GMT
Server: Apache/2.4.51 (Debian)
Content-Length: 312
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>400 Bad Request</title>
</head><body>
<h1>Bad Request</h1>
<p>Your browser sent a request that this server could not understand.<br />
</p>
<hr>
<address>Apache/2.4.51 (Debian) Server at grafana.pc-freak.net Port 80</address>
</body></html>
Connection closed by foreign host.

 

root@webserver:~# telnet localhost 80
Trying 127.0.0.1…
Connected to localhost.
Escape character is '^]'.
HEAD / HTTP/1.1

HTTP/1.1 400 Bad Request
Date: Fri, 15 Apr 2022 19:05:07 GMT
Server: Apache/2.4.51 (Debian)
Connection: close
Content-Type: text/html; charset=iso-8859-1

Connection closed by foreign host.


To test it with telnet you can follow the Proxy CONNECT syntax and simulate you're connecting from a proxy server, like that:
 

root@webserver:~# telnet localhost 80
Trying 127.0.0.1…
Connected to localhost.
Escape character is '^]'.
CONNECT localhost:80 HTTP/1.0

HTTP/1.1 301 Moved Permanently
Date: Fri, 15 Apr 2022 19:13:38 GMT
Server: Apache/2.4.51 (Debian)
Location: https://zabbix.pc-freak.net
Cache-Control: max-age=900
Expires: Fri, 15 Apr 2022 19:28:38 GMT
Content-Length: 310
Connection: close
Content-Type: text/html; charset=iso-8859-1

<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="https://zabbix.pc-freak.net">here</a>.</p>
<hr>
<address>Apache/2.4.51 (Debian) Server at localhost Port 80</address>
</body></html>
Connection closed by foreign host.

You can test with curl simulating the proxy protocol CONNECT with:

root@webserver:~# curl –insecure –haproxy-protocol https://192.168.2.30

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://w3.org/TR/html4/loose.dtd">
<html>
<head>
<meta name="generator" content="pc-freak.net tidy">
<script src="https://ssl.google-analytics.com/urchin.js" type="text/javascript">
</script>
<script type="text/javascript">
_uacct = "UA-2102595-3";
urchinTracker();
</script>
<script type="text/javascript">
var gaJsHost = (("https:" == document.location.protocol) ? "https://ssl." : "http://");
document.write(unescape("%3Cscript src='" + gaJsHost + "google-analytics.com/ga.js' type='text/javascript'%3E%3C/script%3E"));
</script>
<script type="text/javascript">
try {
var pageTracker = _gat._getTracker("UA-2102595-6");
pageTracker._trackPageview();
} catch(err) {}
</script>

 

      –haproxy-protocol
              (HTTP) Send a HAProxy PROXY protocol v1 header at the beginning of the connection. This is used by some load balancers and reverse proxies
              to indicate the client's true IP address and port.

              This option is primarily useful when sending test requests to a service that expects this header.

              Added in 7.60.0.


7. Check apache log if remote Real Internet Source IPs are properly logged
 

root@webserver:~# tail -n 10 /var/log/apache2/access.log

213.183.53.58 – – [15/Apr/2022:22:18:59 +0300] "GET /proxy/browse.php?u=https%3A%2F%2Fsteamcommunity.com%2Fmarket%2Fitemordershistogram%3Fcountry HTTP/1.1" 200 12701 "https://www.pc-freak.net" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:98.0) Gecko/20100101 Firefox/98.0"
88.198.48.184 – – [15/Apr/2022:22:18:58 +0300] "GET /blog/iq-world-rank-country-smartest-nations/?cid=1330192 HTTP/1.1" 200 29574 "-" "Mozilla/5.0 (compatible; DataForSeoBot/1.0; +https://dataforseo.com/dataforseo-bot)"
213.183.53.58 – – [15/Apr/2022:22:19:00 +0300] "GET /proxy/browse.php?u=https%3A%2F%2Fsteamcommunity.com%2Fmarket%2Fitemordershistogram%3Fcountry
HTTP/1.1" 200 9080 "https://www.pc-freak.net" "Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:98.0) Gecko/20100101 Firefox/98.0"
159.223.65.16 – – [15/Apr/2022:22:19:01 +0300] "POST //blog//xmlrpc.php HTTP/1.1" 200 5477 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36"
159.223.65.16 – – [15/Apr/2022:22:19:02 +0300] "POST //blog//xmlrpc.php HTTP/1.1" 200 5477 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/89.0.4389.114 Safari/537.36"
213.91.190.233 – – [15/Apr/2022:22:19:02 +0300] "POST /blog/wp-admin/admin-ajax.php HTTP/1.1" 200 1243 "https://www.pc-freak.net/blog/wp-admin/post.php?post=16754&action=edit" "Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:89.0) Gecko/20100101 Firefox/89.0"
46.10.215.119 – – [15/Apr/2022:22:19:02 +0300] "GET /images/saint-Paul-and-Peter-holy-icon.jpg HTTP/1.1" 200 134501 "https://www.google.com/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/100.0.4896.75 Safari/537.36 Edg/100.0.1185.39"
185.191.171.42 – – [15/Apr/2022:22:19:03 +0300] "GET /index.html.latest/tutorials/tutorials/penguins/vestnik/penguins/faith/vestnik/ HTTP/1.1" 200 11684 "-" "Mozilla/5.0 (compatible; SemrushBot/7~bl; +http://www.semrush.com/bot.html)"

116.179.37.243 – – [15/Apr/2022:22:19:50 +0300] "GET /blog/wp-content/cookieconsent.min.js HTTP/1.1" 200 7625 "https://www.pc-freak.net/blog/how-to-disable-nginx-static-requests-access-log-logging/" "Mozilla/5.0 (compatible; Baiduspider-render/2.0; +http://www.baidu.com/search/spider.html)"
116.179.37.237 – – [15/Apr/2022:22:19:50 +0300] "GET /blog/wp-content/plugins/google-analytics-dashboard-for-wp/assets/js/frontend-gtag.min.js?ver=7.5.0 HTTP/1.1" 200 8898 "https://www.pc-freak.net/blog/how-to-disable-nginx-static-requests-access-log-logging/" "Mozilla/5.0 (compatible; Baiduspider-render/2.0; +http://www.baidu.com/search/spider.html)"

 

You see from above output remote Source IPs in green are properly logged, so haproxy Cluster is correctly forwarding connections passing on in the Haproxy generated Initial header the Real IP of its remote connect IPs.


Sum it up, What was done?


HTTP_X_FORWARD_FOR is impossible to set, when haproxy is used on mode tcp and all traffic is sent as received from TCP IPv4 / IPv6 Network stack, e.g. modifying any HTTP sent traffic inside the headers is not possible as this might break up the data.

Thus Haproxy was configured to send all its received data by sending initial proxy header with the X_FORWARDED usual Source IP data, then remoteip Apache module was used to make Apache receive and understand haproxy sent Header which contains the original Source IP via the send-proxy functionality and example was given on how to test the remoteip on Webserver is working correctly.

Finally you've seen how to check configured haproxy and webserver are able to send and receive the End Client data with the originator real source IP correctly and those Internet IP is properly logged inside both haproxy and apaches.

Monitor service log is continously growing with Zabbix on Windows with batch userparameter script and trigger Alert if log is unchanged

Thursday, March 17th, 2022

monitor-if-log-file-is-growing-with-zabbix-zabbix-userparameter-script-howto

Recently we had an inteteresting Monitoring work task to achieve. We have an Application that is constantly simulating encrypted connections traffic to a remote side machine and sending specific data on TCP/IP ports.
Communiucation between App Server A -> App Server B should be continous and if all is working as expected App Server A messages output are logged in the Application log file on the machine which by the way Runs
Windows Server 2020.

Sometimes due to Network issues this constant reconnections from the Application S. A to the remote checked machine TCP/IP ports gets interrupted due to LAN issues or a burned Network Switch equipment, misconfiguration on the network due to some Network admin making stoopid stuff etc..

Thus it was important to Monitor somehow whether the log is growing or not and feed the output of whether Application log file is growing or it stuck to a Central Zabbix Server. 
To be able to better understand the task, lets divide the desired outcome in few parts on required:

1. Find The latest file inside a folder C:\Path-to-Service\Monitoring\Log\
2. Open the and check it is current logged records and log the time
3. Re-open file in a short while and check whether in few seconds new records are written
4. Report the status out to Zabbix
5. Make Zabbix Item / Trigger / Action in case if monitored file is not growing

In below article I'll briefly explain how Monitoring a Log on a Machine for growing was implemented using a pure good old WIN .BAT (.batch) script and Zabbix Userparameter key

 

1. Enable userparameter script for Local Zabbix-Agent on the Windows 10 Server Host


Edit Zabbix config file usually on Windows Zabbix installs file is named:

zabbix_agentd.win ]


Uncomment the following lines to enable userparameter support for zabbix-agentd:

 

# Include=c:\zabbix\zabbix_agentd.userparams.conf

Include=c:\zabbix\zabbix_agentd.conf.d\

# Include=c:\zabbix\zabbix_agentd.conf.d\*.conf


2. Create folders for userparameter script and for the userparameter.conf

Before creating userparameter you can to create the folder and grant permissions

Folder name under C:\Zabbix -> zabbix_agentd.conf.d

If you don't want to use Windows Explorer) but do it via cmd line:

C:\Users\LOGUser> mkdir \Zabbix\zabbix_agentd.conf\
C:\User\LOGUser> mkdir \Zabbix\zabbix_scripts\


3. Create Userparameter with some name file ( Userparameter-Monitor-Grow.conf )

In the directory C:\Zabbix\zabbix_agentd.conf.d you should create a config file like:
Userparameter-Monitor-Grow.conf and in it you should have a standard userparameter key and script so file content is:

UserParameter=service.check,C:\Zabbix\zabbix_scripts\GROW_LOG_MONITOR-USERPARAMETER.BAT


4. Create the Batch script that will read the latest file in the service log folder and will periodically check and report to zabbix that file is changing

notepad C:\Zabbix\zabbix_scripts\GROW_LOG_MONITOR-USERPARAMETER.BAT

REM "SCRIPT MONITOR IF FILE IS GROWING OR NOT"
@echo off

set work_dir=C:\Path-to-Service\Monitoring\Log\

set client=client Name

set YYYYMMDD=%DATE:~10,4%%DATE:~4,2%%DATE:~7,2%

set name=csv%YYYYMMDD%.csv

set mytime=%TIME:~0,8%

for %%I in (..) do set CurrDirName=%%~nxI

 

setlocal EnableDelayedExpansion

set "line1=findstr /R /N "^^" %work_dir%\output.csv | find /C ":""


for /f %%a in ('!line1!') do set number1=%%a

set "line2=findstr /R /N "^^" %work_dir%\%name% | find /C ":""


for /f %%a in ('!line2!') do set number2=%%a

 

IF  %number1% == %number2% (

echo %YYYYMMDD% %mytime% MAJOR the log is not incrementing for %client%

echo %YYYYMMDD% %mytime% MAJOR the log is not incrementing for %client% >> monitor-grow_err.log

) ELSE (

echo %YYYYMMDD% %mytime% NORMAL the log is incrementing for %client%

SETLOCAL DisableDelayedExpansion

del %work_dir%\output.csv

FOR /F "usebackq delims=" %%a in (`"findstr /n ^^ %work_dir%\%name%"`) do (

    set "var=%%a"

    SETLOCAL EnableDelayedExpansion

    set "var=!var:*:=!"

    echo(!var! >> %work_dir%\output.csv

    ENDLOCAL

)

)
 

 

To download GROW_LOG_MONITOR-USERPARAMETER.BAT click here.
The script needs to have configured the path to directory containing multiple logs produced by the Monitored Application.
As prior said it will, list the latest created file based on DATE timestamp in the folder will output a simple messages:

If the log file is being fed with data the script will output to output.csv messages continuously, either:

%%mytime%% NORMAL the log is incrementing for %%client%%

Or if the Monitored application log is not writting anything for a period it will start writting messages

%%mytime%%mytime MAJOR the log is not incrementing for %client%

The messages will also be sent in Zabbix.

Before saving the script make sure you modify the Full Path location to the Monitored file for growing, i.e.:

set work_dir=C:\Path-to-Service\Monitoring\Log\


5. Create The Zabbix Item

Set whatever service.check name you would like and a check interval to fetch the info from the userparameter (if you're dealing with very large log files produced by Monitored log of application, then 10 minutes might be too frequent, in most cases 10 minutes should be fine)
monitor-if-log-grows-windows-zabbix-item-service-check-screenshot
 

6. Create Zabbix Trigger


You will need a Trigger something similar to below:

Now considering that zabbix server receives correctly data from the client and the monitored log is growing you should in Zabbix:

%%mytime%% NORMAL the log is incrementing for %%client%%


7. Lastly create an Action to send Email Alert if log is not growing

Linux script to periodically log enabled systemctl services, configured network IPs and routings, server established connections and iptables firewall rules

Tuesday, January 25th, 2022

bash-script-command-line-script-logo

For those who are running some kind of server be it virtual or physical, where multiple people or many systemins have access, sometimes it could be quite a mess as someone due to miscommunication or whatever could change something on the configured Network Ethernet interfaces, or configured routing tables, or simply issue an update which might change the set of automatically set to run systemctl services due to update. Such changes on a Linux server Operating system often can remain unnoticed and could cause quite a harm. Even when the change is noticed the logical question occurs what was the previous network route on the server or what kind of network was configured on Ethernet interface ethX etc. 
Problems like the described where, pretty common in many public Private Clouds or VMWare / XEN based Hypervisors that host multiple  Virtual machines, for that reason I've developed a small script which is pretty dumb on the first glimpse but mostly useful as it keeps historical records of such important information.
 

#!/bin/sh
# script to show configured services on system, configured IPs, netstat state and network routes
# Script to be used during CentOS and Redhat Enterprise Linux RPM package updates with yum

output_file=network_ip_routes_services_status;
ddate=$(date '+%Y-%m-%d_%H-%M-%S');
iptables=$(which iptables);
if [ ! -d /root/logs/ ]; then
mkdir /root/logs/;
fi

echo "STARTED: $(date '+%Y-%m-%d_%H-%M-%S'):" | tee -a /root/logs/$output_file-$(hostname)-$ddate.log
echo -e "# systemctl list-unit-files\n" | tee -a /root/logs/$output_file-$(hostname)-$ddate.log
systemctl list-unit-files –type=service | grep enabled | tee -a /root/logs/$output_file-$(hostname)-$ddate.log
echo -e '# systemctl | grep ".service" | grep "running"\n' | tee -a /root/logs/$output_file-$(hostname)-$ddate.log
systemctl | grep ".service" | grep "running" | tee -a /root/logs/$output_file-$(hostname)-$ddate.log
echo -e "# netstat -tulpn\n" | tee -a /root/logs/$output_file-$(hostname)-$ddate.log
netstat -tulpn | tee -a /root/logs/$output_file-$(hostname)-$ddate.log
echo -e "# netstat -r\n" | tee -a /root/logs/$output_file-$(hostname)-$ddate.log
netstat -r | tee -a /root/logs/$output_file-$(hostname)-$ddate.log
echo -e "# ip a s\n" | tee -a /root/logs/$output_file-$(hostname)-$ddate.log
ip a s | tee -a /root/logs/$output_file-$(hostname)-$ddate.log
echo -e "# /sbin/route -n\n" | tee -a /root/logs/$output_file-$(hostname)-$ddate.log
/sbin/route -n | tee -a /root/logs/$output_file-$(hostname)-$ddate.log
echo -e "# $iptables -L -n\n" | tee -a /root/logs/$output_file-$(hostname)-$ddate.log
echo -e "# $iptables -t nat -L" | tee -a /root/logs/$output_file-$(hostname)-$ddate.log
$iptables -L -n | tee -a /root/logs/$output_file-$(hostname)-$ddate.log
$iptables -t nat -L | tee -a /root/logs/$output_file-$(hostname)-$ddate.log
echo "ENDED $(date '+%Y-%m-%d_%H-%M-%S'):" | tee -a /root/logs/$output_file-$(hostname)-$ddate.log

 

Script produces its logs inside  /root/logs/network_ip_routes_services_status*hostname*currentdate*.log, put the script inside /root/ or wherever you like.

To keep an eye how network routing or ip configuration or firewall changed or there was a peak with the established connections towards daemons running on host (lets say requiring a machine upgrade), I've set the script to run as usually via cron job at the end of the predefined cron job tasks, like so:

# crontab -u root -e
# periodic dump and log network routing tables, netstat and systemctl list-unit-files
*/1 01 01,25 * * /root/show_running_services_netstat_ips_route1.sh 2>&1 >/dev/null

You can download a copy of show_running_services_netstat_ips_route1.sh script here.
The script is written without much of efficiency on mind, as you can see the with the multiple tee -a and for critical hosts it might be a good idea to rewrite it to use '>>' OPERAND instead, anyhows as most machines today are pretty powerful it doesn't really matter much.

Of course today such a script is quite archaic, as most big corporations are using much more complex monitoring software such as Zabbix, Prometheus or if some kind of Elastic Search is used Kibana etc. but for a basic needs and even for a double checking and comparing with other more advanced monitoring tools  (in case if monitoring tools  database gets damaged or temporary down until backupped), still I think such an oldschool simple monitoring script can be of use.

A good addition to that if you use a central logging server is to set another cron to periodically synchronize produced /root/logs/* to somewhere, here is how to do it with simple rsync (considering your host is configured to login with a user without password with ssh key authentication).

# HOSTNAME=$(hostname); rsync -axHv –ignore-existing -e 'ssh -p 22' /bashscripts/  -q -i –out-format="%t %f %b" –log-file=/var/log/rsync_sync_jobs.log –info=progress2 root@BACKUP_SERVER_HOST:/$(HOSTNAME)-logs/

Once something strange occurs with the machine, like the machine needs to be rebuild

I would be glad to hear if some of my readers uses some useful script which I can adopt myself. Cheers  🙂