Archive for the ‘Bash Scripting’ Category

DNS Monitoring: Check and Alert if DNS nameserver resolver of Linux machine is not properly resolving shell script. Monitor if /etc/resolv.conf DNS runs Okay

Thursday, March 14th, 2024

linux-monitor-check-dns-is-resolving-fine

If you happen to have issues occasionally with DNS resolvers and you want to keep up an eye on it and alert if DNS is not properly resolving Domains, because sometimes you seem to have issues due to network disconnects, disturbances (modifications), whatever and you want to have another mean to see whether a DNS was reachable or unreachable for a time, here is a little bash shell script that does the "trick".

Script work mechacnism is pretty straight forward as you can see we check what are the configured nameservers if they properly resolve and if they're properly resolving we write to log everything is okay, otherwise we write to the log DNS is not properly resolvable and send an ALERT email to preconfigured Email address.

Below is the check_dns_resolver.sh script:

 

#!/bin/bash
# Simple script to Monitor DNS set resolvers hosts for availability and trigger alarm  via preset email if any of the nameservers on the host cannot resolve
# Use a configured RESOLVE_HOST to try to resolve it via available configured nameservers in /etc/resolv.conf
# if machines are not reachable send notification email to a preconfigured email
# script returns OK 1 if working correctly or 0 if there is issue with resolving $RESOLVE_HOST on $SELF_HOSTNAME and mail on $ALERT_EMAIL
# output of script is to be kept inside DNS_status.log

ALERT_EMAIL='your.email.address@email-fqdn.com';
log=/var/log/dns_status.log;
TIMEOUT=3; DNS=($(grep -R nameserver /etc/resolv.conf | cut -d ' ' -f2));  

SELF_HOSTNAME=$(hostname –fqdn);
RESOLVE_HOST=$(hostname –fqdn);

for i in ${DNS[@]}; do dns_status=$(timeout $TIMEOUT nslookup $RESOLVE_HOST  $i); 

if [[ “$?” == ‘0’ ]]; then echo "$(date "+%y.%m.%d %T") $RESOLVE_HOST $i on host $SELF_HOST OK 1" | tee -a $log; 
else 
echo "$(date "+%y.%m.%d %T")$RESOLVE_HOST $i on host $SELF_HOST NOT_OK 0" | tee -a $log; 

echo "$(date "+%y.%m.%d %T") $RESOLVE_HOST $i DNS on host $SELF_HOST resolve ERROR" | mail -s "$RESOLVE_HOST /etc/resolv.conf $i DNS on host $SELF_HOST resolve ERROR";

fi

 done

Download check_dns_resolver.sh here set the script to run via a cron job every lets say 5 minutes, for example you can set a cronjob like this:
 

# crontab -u root -e
*/5 * * * *  check_dns_resolver.sh 2>&1 >/dev/null

 

Then Voila, check the log /var/log/dns_status.log if you happen to run inside a service downtime and check its output with the rest of infrastructure componets, network switch equipment, other connected services etc, that should keep you in-line to proof during eventual RCA (Root Cause Analysis) if complete high availability system gets down to proof your managed Linux servers was not the reason for the occuring service unavailability.

A simplified variant of the check_dns_resolver.sh can be easily integrated to do Monitoring with Zabbix userparameter script and DNS Check Template containing few Triggers, Items and Action if I have time some time in the future perhaps, I'll blog a short article on how to configure such DNS zabbix monitoring, the script zabbix variant of the DNS monitor script is like this:

[root@linux-server bin]# cat check_dns_resolver.sh 
#!/bin/bash
TIMEOUT=3; DNS=($(grep -R nameserver /etc/resolv.conf | cut -d ' ' -f2));  for i in ${DNS[@]}; do dns_status=$(timeout $TIMEOUT nslookup $(hostname –fqdn) $i); if [[ “$?” == ‘0’ ]]; then echo "$i OK 1"; else echo "$i NOT OK 0"; fi; done

[root@linux-server bin]#


Hope this article, will help someone to improve his Unix server Infrastucture monitoring.

Enjoy and Cheers !

Simple bash shell Script to easy and quickly automate deploy RHEL Linux Virtual Machines on KVM Virtualization

Friday, November 3rd, 2023

how-to-install-a-kvm-guest-os-from-the-commandline-easily-with-script-virt-install-logo

Earlier I've blogged in about howto build KVM Virtual Machine RHEL 8.3 Linux install on Redhat 8.3 Linux Hypervisor with custom tailored kickstart.cfg as one of the Projects i was involved in my work duty as system administrator, we had the task to build
a KVM virtual machines and build a High Availability Linux  PCS Corosync / Pacemaker / haproxy cluster on it to save some money for the company from purchasing VMWare licenses.

The setup of the KVM Virtual machine on a first glimpse is relatively simple and I thought, this can be done just for 2 / 3 days but it turned out to take up 2 weeks or so to properly prepare the kickstarter file and learn a bit about basic virt-install KVM options and experiment with them until
we can produce a noral working Virtual machines. 

The original developed simple script that I used to bring up a new KVM virtual machines in a bit easier way virt-install-kvm.sh looked like so:
 

#!/bin/sh
# Script to build a new VM based on a kickstart.cfg file template
# with virt-install
# Author Georgi Georgiev hip0
# hipo@Pc-freak.net

KS_FILE='kickstart.cfg';
VM_NAME='RHEL8_3-VirtualMachine';
VM_DESCR='CentOS 8.3 Virtual Machine';
RAM='8192';
CPUS='8';
# size is in Gigabytes
VM_IMG_SIZE='70';
ISO_LOCATION='/vmprivate/rhel-server-8.3-x86_64-dvd.iso';
VM_IMG_FILE_LOC='/vmprivate/RHEL8_3-VirtualMachine.img';

virt-install -n "$VMNAME" –description "$VM_DESCR" –os-type=Linux –os-variant=rhel8.3 –ram=8192 –vcpus=8 –location="$ISO_LOCATION" –disk path=$VM_IMG_FILE,bus=virtio,size=$IMG_VM_SIZE –graphics none –initrd-inject=/root/$KS_FILE –extra-args "console=ttyS0 ks=file:/$KS_FILE"

 

The script basicly did what it was aimed for but modifying the kickstarter ks.cfg every time with multiple additional parameters, like network configurations as well some additional modifications to ks.cfg for VM parameters was annoying.

Recently as we had to repeat the same task again in order to migrate old customer containing a Linux OpenVZ Virtual machines to a newer  OS installed RHEL versionb with KVM virtual machines, i've took some hours and scripted a small script that would easiy our task to build new KVM Virtual Machines
from scratch relatively easy, if we have to repeat the Linux OS build operation again and again.

Before proceeding to use the script of course one has to use LVM and setup the partitions on the Host which will be the Hypervisor where the KVM VMs will be installed as well you need to download an ISO image of Redhat Enterprise Linux / CentOS / Fedora or whatever kind of RPM based Linux
you would like to setup as well as do the basic configurations regarding the emulated Hardware node "Power" of the new Virtual Machine (CPU / Memory / Disk Partition) as well as choose a proper meaningful VM name (preferrably following some good meaningful crafted naming convention, that will talk a bit about what is inside the KVM VM container).

The script that automates a bit the KVM VM installation which you find below you can also download from here virt-install.sh

 

#!/bin/bash
virt_install_path=$(which virt-install);
vm_host='vmname01';
boot_iso='/vmprivate/rhel-8.7-x86_64-dvd.iso';
os_version='rhel8.7';
vm_cpus_count='4';
vm_ram='6144';
vm_img_location='/vmprivate/NAME-of-VM.img';
vm_machine_description='Name Production system';
ks_file_location_template='/root/ks.cfg.templ';
ks_file_location='/root/ks.cfg';
ks_vm_read_loc=$(echo $ks_file_location |sed -e 's#root/##g');
vm_main_ip='192.168.23.52';
vm_netmask='255.255.255.192';
vm_gateway='192.168.23.33';
vm_nameserver='172.30.50.2';
#–ip=192.168.233.52 –netmask=255.255.255.192 –gateway=192.168.233.33 –nameserver=172.20.88.2

echo "Checking if defined $vm_host image $vm_img_location is not already present";
if [ ! -f $vm_img_location ]; then
echo
else
echo 'Exiting $vm_img_location is present';
echo "To destroy exiting VM image $vm_img_location run manually cmds:";
echo 'virsh list; virsh destroy $vm_host; virsh undefine $vm_host –remove-all-storage';
exit 1;
fi

alias cp='cp -f';
/usr/bin/cp -rpf $ks_file_location_template $ks_file_location;

echo "Setting  VM Main IP: $vm_main_ip";
echo "Setting  VM netmask: $vm_netmask";
echo "Setting  VM Gateway: $vm_gateway";
echo "Setting  VM Nameserver $vm_nameserver";
/usr/bin/perl -pi -e "s/ip=a1.a2.a3.a4/ip=$vm_main_ip/" $ks_file_location
/usr/bin/perl -pi -e "s/netmask=b1.b2.b3.b4/netmask=$vm_netmask/" $ks_file_location
/usr/bin/perl -pi -e "s/gateway=c1.c2.c3.c4/nameserver=$vm_nameserver/" $ks_file_location
/usr/bin/perl -pi -e "s/nameserver=d1.d2.d3.d4/gateway=$vm_gateway/" $ks_file_location
/usr/bin/perl -pi -e "s/vm-hostname/$vm_host/" $ks_file_location

echo "Running VM install:";
echo $virt_install_path -n $vm_host –description "$vm_machine_description" –os-type=Linux –os-variant=$os_version –ram=$vm_ram –vcpus=$vm_cpus_count –location=$boot_iso –disk path=$vm_img_location,bus=virtio,size=90 –graphics none –initrd-inject=$ks_file_location –extra-args "console=ttyS0 ks=file:$ks_vm_read_loc"

$virt_install_path -n $vm_host –description "$vm_machine_description" –os-type=Linux –os-variant=$os_version –ram=$vm_ram –vcpus=$vm_cpus_count –location=$boot_iso –disk path=$vm_img_location,bus=virtio,size=90 –graphics none –initrd-inject=$ks_file_location –extra-args "console=ttyS0 ks=file:$ks_vm_read_loc"
 

 

Modify the script to insert the required parameters of the new VM in the script header session, you will have to provide below options.

 

vm_host='vmname01';
boot_iso='/vmprivate/rhel-8.7-x86_64-dvd.iso';
os_version='rhel8.7';
vm_cpus_count='4';
vm_ram='6144';
vm_img_location='/vmprivate/NAME-of-VM.img';
vm_machine_description='Name Production system';
ks_file_location_template='/root/ks.cfg.templ';
ks_file_location='/root/ks.cfg';
ks_vm_read_loc=$(echo $ks_file_location |sed -e 's#root/##g');
vm_main_ip='192.168.23.52';
vm_netmask='255.255.255.192';
vm_gateway='192.168.23.33';
vm_nameserver='172.30.50.2';

The automatic Build virtual machine script is tested and works with Redhat Enterprise Linux and of course is pretty primitive as there is so much available online that would do similar, but still I like it because it works for me 🙂

You will need also the ks.cfg.templ file which has the basic kickstarter configuration that will bring up the Virtual machines according to predefined configurations.
Here is the ks.cfg.templ and ks.cfg files as well, you will have to place them under /root/ or some other directory.
Of course this script is just a basic one, it can be easily updated to accept its variable options as arguments if you need to bring up a multitude of virtual machines relatively quickly with few minor modifications. 

Hope the script is helpful to some sysadmin out there. If so don't forget to donate me for a beer in my Patreon account found in the widgets section 🙂

How to set up Notify by email expiring local UNIX user accounts on Linux / BSD with a bash script

Thursday, August 24th, 2023

password-expiry-linux-tux-logo-script-picture-how-to-notify-if-password-expires-on-unix

If you have already configured Linux Local User Accounts Password Security policies Hardening – Set Password expiry, password quality, limit repatead access attempts, add directionary check, increase logged history command size and you want your configured local user accounts on a Linux / UNIX / BSD system to not expire before the user is reminded that it will be of his benefit to change his password on time, not to completely loose account to his account, then you might use a small script that is just checking the upcoming expiry for a predefined users and emails in an array with lslogins command like you will learn in this article.

The script below is written by a colleague Lachezar Pramatarov (Credit for the script goes to him) in order to solve this annoying expire problem, that we had all the time as me and colleagues often ended up with expired accounts and had to bother to ask for the password reset and even sometimes clearance of account locks. Hopefully this little script will help some other unix legacy admin systems to get rid of the account expire problem.

For the script to work you will need to have a properly configured SMTP (Mail server) with or without a relay to be able to send to the script predefined email addresses that will get notified. 

Here is example of a user whose account is about to expire in a couple of days and who will benefit of getting the Alert that he should hurry up to change his password until it is too late 🙂

[root@linux ~]# date
Thu Aug 24 17:28:18 CEST 2023

[root@server~]# chage -l lachezar
Last password change                                    : May 30, 2023
Password expires                                        : Aug 28, 2023
Password inactive                                       : never
Account expires                                         : never
Minimum number of days between password change          : 0
Maximum number of days between password change          : 90
Number of days of warning before password expires       : 14

Here is the user_passwd_expire.sh that will report the user

# vim  /usr/local/bin/user_passwd_expire.sh

#!/bin/bash

# This script will send warning emails for password expiration 
# on the participants in the following list:
# 20, 15, 10 and 0-7 days before expiration
# ! Script sends expiry Alert only if day is Wednesday – if (( $(date +%u)==3 )); !

# email to send if expiring
alert_email='alerts@pc-freak.net';
# the users that are admins added to belong to this group
admin_group="admins";
notify_email_header_customer_name='Customer Name';

declare -A mails=(
# list below accounts which will receive account expiry emails

# syntax to define uid / email
# [“account_name_from_etc_passwd”]="real_email_addr@fqdn";

#    [“abc”]="abc@fqdn.com"
#    [“cba”]="bca@fqdn.com"
    [“lachezar”]="lachezar.user@gmail.com"
    [“georgi”]="georgi@fqdn-mail.com"
    [“acct3”]="acct3@fqdn-mail.com"
    [“acct4”]="acct4@fqdn-mail.com"
    [“acct5”]="acct5@fqdn-mail.com"
    [“acct6”]="acct6@fqdn-mail.com"
#    [“acct7”]="acct7@fqdn-mail.com"
#    [“acct8”]="acct8@fqdn-mail.com"
#    [“acct9”]="acct9@fqdn-mail.com"
)

declare -A days

while IFS="=" read -r person day ; do
  days[“$person”]="$day"
done < <(lslogins –noheadings -o USER,GROUP,PWD-CHANGE,PWD-WARN,PWD-MIN,PWD-MAX,PWD-EXPIR,LAST-LOGIN,FAILED-LOGIN  –time-format=iso | awk '{print "echo "$1" "$2" "$3" $(((($(date +%s -d \""$3"+90 days\")-$(date +%s)))/86400)) "$5}' | /bin/bash | grep -E " $admin_group " | awk '{print $1 "=" $4}')

#echo ${days[laprext]}
for person in "${!mails[@]}"; do
     echo "$person ${days[$person]}";
     tmp=${days[$person]}

#     echo $tmp
# each person will receive mails only if 20th days / 15th days / 10th days remaining till expiry or if less than 7 days receive alert mail every day

     if  (( (${tmp}==20) || (${tmp}==15) || (${tmp}==10) || ((${tmp}>=0) && (${tmp}<=7)) )); 
     then
         echo "Hello, your password for $(hostname -s) will expire after ${days[$person]} days.” | mail -s “$notify_email_header_customer_name $(hostname -s) server password expiration”  -r passwd_expire ${mails[$person]};
     elif ((${tmp}<0));
     then
#          echo "The password for $person on $(hostname -s) has EXPIRED before{days[$person]} days. Please take an action ASAP.” | mail -s “EXPIRED password of  $person on $(hostname -s)”  -r EXPIRED ${mails[$person]};

# ==3 meaning day is Wednesday the day on which OnCall Person changes

        if (( $(date +%u)==3 ));
        then
             echo "The password for $person on $(hostname -s) has EXPIRED. Please take an action." | mail -s "EXPIRED password of  $person on $(hostname -s)"  -r EXPIRED $alert_email;
        fi
     fi  
done

 


To make the script notify about expiring user accounts, place the script under some directory lets say /usr/local/bin/user_passwd_expire.sh and make it executable and configure a cron job that will schedule it to run every now and then.

# cat /etc/cron.d/passwd_expire_cron

# /etc/cron.d/pwd_expire
#
# Check password expiration for users
#
# 2023-01-16 LPR
#
02 06 * * * root /usr/local/bin/user_passwd_expire.sh >/dev/null

Script will execute every day morning 06:02 by the cron job and if the day is wednesday (3rd day of week) it will send warning emails for password expiration if 20, 15, 10 days are left before account expires if only 7 days are left until the password of user acct expires, the script will start sending the Alarm every single day for 7th, 6th … 0 day until pwd expires.

If you don't have an expiring accounts and you want to force a specific account to have a expire date you can do it with:

# chage -E 2023-08-30 someuser


Or set it for new created system users with:

# useradd -e 2023-08-30 username


That's it the script will notify you on User PWD expiry.

If you need to for example set a single account to expire 90 days from now (3 months) that is a kind of standard password expiry policy admins use, do it with:

# date -d "90 days" +"%Y-%m-%d"
2023-11-22


Ideas for user_passwd_expire.sh script improvement
 

The downside of the script if you have too many local user accounts is you have to hardcode into it the username and user email_address attached to and that would be tedios task if you have 100+ accounts. 

However it is pretty easy if you already have a multitude of accounts in /etc/passwd that are from UID range to loop over them in a small shell loop and build new array from it. Of course for a solution like this to work you will have to have defined as user data as GECOS with command like chfn.
 

[georgi@server ~]$ chfn
Changing finger information for test.
Name [test]: 
Office []: georgi@fqdn-mail.com
Office Phone []: 
Home Phone []: 

Password: 

[root@server test]# finger georgi
Login: georgi                       Name: georgi
Directory: /home/georgi                   Shell: /bin/bash
Office: georgi@fqdn-mail.com
On since чт авг 24 17:41 (EEST) on :0 from :0 (messages off)
On since чт авг 24 17:43 (EEST) on pts/0 from :0
   2 seconds idle
On since чт авг 24 17:44 (EEST) on pts/1 from :0
   49 minutes 30 seconds idle
On since чт авг 24 18:04 (EEST) on pts/2 from :0
   32 minutes 42 seconds idle
New mail received пт окт 30 17:24 2020 (EET)
     Unread since пт окт 30 17:13 2020 (EET)
No Plan.

Then it should be relatively easy to add the GECOS for multilpe accounts if you have them predefined in a text file for each existing local user account.

Hope this script will help some sysadmin out there, many thanks to Lachezar for allowing me to share the script here.
Enjoy ! 🙂

How to monitor Haproxy Application server backends with Zabbix userparameter autodiscovery scripts

Friday, May 13th, 2022

zabbix-backend-monitoring-logo

Haproxy is doing quite a good job in High Availability tasks where traffic towards multiple backend servers has to be redirected based on the available one to sent data from the proxy to. 

Lets say haproxy is configured to proxy traffic for App backend machine1 and App backend machine2.

Usually in companies people configure a monitoring like with Icinga or Zabbix / Grafana to keep track on the Application server is always up and running. Sometimes however due to network problems (like burned Network Switch / router or firewall misconfiguration) or even an IP duplicate it might happen that Application server seems to be reporting reachable from some monotoring tool on it but unreachable from  Haproxy server -> App backend machine2 but reachable from App backend machine1. And even though haproxy will automatically switch on the traffic from backend machine2 to App machine1. It is a good idea to monitor and be aware that one of the backends is offline from the Haproxy host.
In this article I'll show you how this is possible by using 2 shell scripts and userparameter keys config through the autodiscovery zabbix legacy feature.
Assumably for the setup to work you will need to have as a minimum a Zabbix server installation of version 5.0 or higher.

1. Create the required  haproxy_discovery.sh  and haproxy_stats.sh scripts 

You will have to install the two scripts under some location for example we can put it for more clearness under /etc/zabbix/scripts

[root@haproxy-server1 ]# mkdir /etc/zabbix/scripts

[root@haproxy-server1 scripts]# vim haproxy_discovery.sh 
#!/bin/bash
#
# Get list of Frontends and Backends from HAPROXY
# Example: ./haproxy_discovery.sh [/var/lib/haproxy/stats] FRONTEND|BACKEND|SERVERS
# First argument is optional and should be used to set location of your HAPROXY socket
# Second argument is should be either FRONTEND, BACKEND or SERVERS, will default to FRONTEND if not set
#
# !! Make sure the user running this script has Read/Write permissions to that socket !!
#
## haproxy.cfg snippet
#  global
#  stats socket /var/lib/haproxy/stats  mode 666 level admin

HAPROXY_SOCK=""/var/run/haproxy/haproxy.sock
[ -n “$1” ] && echo $1 | grep -q ^/ && HAPROXY_SOCK="$(echo $1 | tr -d '\040\011\012\015')"

if [[ “$1” =~ (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?):[0-9]{1,5} ]];
then
    HAPROXY_STATS_IP="$1"
    QUERYING_METHOD="TCP"
fi

QUERYING_METHOD="${QUERYING_METHOD:-SOCKET}"

query_stats() {
    if [[ ${QUERYING_METHOD} == “SOCKET” ]]; then
        echo "show stat" | socat ${HAPROXY_SOCK} stdio 2>/dev/null
    elif [[ ${QUERYING_METHOD} == “TCP” ]]; then
        echo "show stat" | nc ${HAPROXY_STATS_IP//:/ } 2>/dev/null
    fi
}

get_stats() {
        echo "$(query_stats)" | grep -v "^#"
}

[ -n “$2” ] && shift 1
case $1 in
        B*) END="BACKEND" ;;
        F*) END="FRONTEND" ;;
        S*)
                for backend in $(get_stats | grep BACKEND | cut -d, -f1 | uniq); do
                        for server in $(get_stats | grep "^${backend}," | grep -v BACKEND | grep -v FRONTEND | cut -d, -f2); do
                                serverlist="$serverlist,\n"'\t\t{\n\t\t\t"{#BACKEND_NAME}":"'$backend'",\n\t\t\t"{#SERVER_NAME}":"'$server'"}'
                        done
                done
                echo -e '{\n\t"data":[\n’${serverlist#,}’]}'
                exit 0
        ;;
        *) END="FRONTEND" ;;
esac

for frontend in $(get_stats | grep "$END" | cut -d, -f1 | uniq); do
    felist="$felist,\n"'\t\t{\n\t\t\t"{#'${END}'_NAME}":"'$frontend'"}'
done
echo -e '{\n\t"data":[\n’${felist#,}’]}'

 

[root@haproxy-server1 scripts]# vim haproxy_stats.sh 
#!/bin/bash
set -o pipefail

if [[ “$1” = /* ]]
then
  HAPROXY_SOCKET="$1"
  shift 0
else
  if [[ “$1” =~ (25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?):[0-9]{1,5} ]];
  then
    HAPROXY_STATS_IP="$1"
    QUERYING_METHOD="TCP"
    shift 1
  fi
fi

pxname="$1"
svname="$2"
stat="$3"

DEBUG=${DEBUG:-0}
HAPROXY_SOCKET="${HAPROXY_SOCKET:-/var/run/haproxy/haproxy.sock}"
QUERYING_METHOD="${QUERYING_METHOD:-SOCKET}"
CACHE_STATS_FILEPATH="${CACHE_STATS_FILEPATH:-/var/tmp/haproxy_stats.cache}"
CACHE_STATS_EXPIRATION="${CACHE_STATS_EXPIRATION:-1}" # in minutes
CACHE_INFO_FILEPATH="${CACHE_INFO_FILEPATH:-/var/tmp/haproxy_info.cache}" ## unused
CACHE_INFO_EXPIRATION="${CACHE_INFO_EXPIRATION:-1}" # in minutes ## unused
GET_STATS=${GET_STATS:-1} # when you update stats cache outsise of the script
SOCAT_BIN="$(which socat)"
NC_BIN="$(which nc)"
FLOCK_BIN="$(which flock)"
FLOCK_WAIT=15 # maximum number of seconds that "flock" waits for acquiring a lock
FLOCK_SUFFIX='.lock'
CUR_TIMESTAMP="$(date '+%s')"

debug() {
  [ “${DEBUG}” -eq 1 ] && echo "DEBUG: $@" >&2 || true
}

debug "SOCAT_BIN        => $SOCAT_BIN"
debug "NC_BIN           => $NC_BIN"
debug "FLOCK_BIN        => $FLOCK_BIN"
debug "FLOCK_WAIT       => $FLOCK_WAIT seconds"
debug "CACHE_FILEPATH   => $CACHE_FILEPATH"
debug "CACHE_EXPIRATION => $CACHE_EXPIRATION minutes"
debug "HAPROXY_SOCKET   => $HAPROXY_SOCKET"
debug "pxname   => $pxname"
debug "svname   => $svname"
debug "stat     => $stat"

# check if socat is available in path
if [ “$GET_STATS” -eq 1 ] && [[ $QUERYING_METHOD == “SOCKET” && -z “$SOCAT_BIN” ]] || [[ $QUERYING_METHOD == “TCP” &&  -z “$NC_BIN” ]]
then
  echo 'ERROR: cannot find socat binary'
  exit 126
fi

# if we are getting stats:
#   check if we can write to stats cache file, if it exists
#     or cache file path, if it does not exist
#   check if HAPROXY socket is writable
# if we are NOT getting stats:
#   check if we can read the stats cache file
if [ “$GET_STATS” -eq 1 ]
then
  if [ -e “$CACHE_FILEPATH” ] && [ ! -w “$CACHE_FILEPATH” ]
  then
    echo 'ERROR: stats cache file exists, but is not writable'
    exit 126
  elif [ ! -w ${CACHE_FILEPATH%/*} ]
  then
    echo 'ERROR: stats cache file path is not writable'
    exit 126
  fi
  if [[ $QUERYING_METHOD == “SOCKET” && ! -w $HAPROXY_SOCKET ]]
  then
    echo "ERROR: haproxy socket is not writable"
    exit 126
  fi
elif [ ! -r “$CACHE_FILEPATH” ]
then
  echo 'ERROR: cannot read stats cache file'
  exit 126
fi

# index:name:default
MAP="
1:pxname:@
2:svname:@
3:qcur:9999999999
4:qmax:0
5:scur:9999999999
6:smax:0
7:slim:0
8:stot:@
9:bin:9999999999
10:bout:9999999999
11:dreq:9999999999
12:dresp:9999999999
13:ereq:9999999999
14:econ:9999999999
15:eresp:9999999999
16:wretr:9999999999
17:wredis:9999999999
18:status:UNK
19:weight:9999999999
20:act:9999999999
21:bck:9999999999
22:chkfail:9999999999
23:chkdown:9999999999
24:lastchg:9999999999
25:downtime:0
26:qlimit:0
27:pid:@
28:iid:@
29:sid:@
30:throttle:9999999999
31:lbtot:9999999999
32:tracked:9999999999
33:type:9999999999
34:rate:9999999999
35:rate_lim:@
36:rate_max:@
37:check_status:@
38:check_code:@
39:check_duration:9999999999
40:hrsp_1xx:@
41:hrsp_2xx:@
42:hrsp_3xx:@
43:hrsp_4xx:@
44:hrsp_5xx:@
45:hrsp_other:@
46:hanafail:@
47:req_rate:9999999999
48:req_rate_max:@
49:req_tot:9999999999
50:cli_abrt:9999999999
51:srv_abrt:9999999999
52:comp_in:0
53:comp_out:0
54:comp_byp:0
55:comp_rsp:0
56:lastsess:9999999999
57:last_chk:@
58:last_agt:@
59:qtime:0
60:ctime:0
61:rtime:0
62:ttime:0
"

_STAT=$(echo -e "$MAP" | grep :${stat}:)
_INDEX=${_STAT%%:*}
_DEFAULT=${_STAT##*:}

debug "_STAT    => $_STAT"
debug "_INDEX   => $_INDEX"
debug "_DEFAULT => $_DEFAULT"

# check if requested stat is supported
if [ -z “${_STAT}” ]
then
  echo "ERROR: $stat is unsupported"
  exit 127
fi

# method to retrieve data from haproxy stats
# usage:
# query_stats "show stat"
query_stats() {
    if [[ ${QUERYING_METHOD} == “SOCKET” ]]; then
        echo $1 | socat ${HAPROXY_SOCKET} stdio 2>/dev/null
    elif [[ ${QUERYING_METHOD} == “TCP” ]]; then
        echo $1 | nc ${HAPROXY_STATS_IP//:/ } 2>/dev/null
    fi
}

# a generic cache management function, that relies on 'flock'
check_cache() {
  local cache_type="${1}"
  local cache_filepath="${2}"
  local cache_expiration="${3}"  
  local cache_filemtime
  cache_filemtime=$(stat -c '%Y' "${cache_filepath}" 2> /dev/null)
  if [ $((cache_filemtime+60*cache_expiration)) -ge ${CUR_TIMESTAMP} ]
  then
    debug "${cache_type} file found, results are at most ${cache_expiration} minutes stale.."
  elif "${FLOCK_BIN}" –exclusive –wait "${FLOCK_WAIT}" 200
  then
    cache_filemtime=$(stat -c '%Y' "${cache_filepath}" 2> /dev/null)
    if [ $((cache_filemtime+60*cache_expiration)) -ge ${CUR_TIMESTAMP} ]
    then
      debug "${cache_type} file found, results have just been updated by another process.."
    else
      debug "no ${cache_type} file found, querying haproxy"
      query_stats "show ${cache_type}" > "${cache_filepath}"
    fi
  fi 200> "${cache_filepath}${FLOCK_SUFFIX}"
}

# generate stats cache file if needed
get_stats() {
  check_cache 'stat' "${CACHE_STATS_FILEPATH}" ${CACHE_STATS_EXPIRATION}
}

# generate info cache file
## unused at the moment
get_info() {
  check_cache 'info' "${CACHE_INFO_FILEPATH}" ${CACHE_INFO_EXPIRATION}
}

# get requested stat from cache file using INDEX offset defined in MAP
# return default value if stat is ""
get() {
  # $1: pxname/svname
  local _res="$("${FLOCK_BIN}" –shared –wait "${FLOCK_WAIT}" "${CACHE_STATS_FILEPATH}${FLOCK_SUFFIX}" grep $1 "${CACHE_STATS_FILEPATH}")"
  if [ -z “${_res}” ]
  then
    echo "ERROR: bad $pxname/$svname"
    exit 127
  fi
  _res="$(echo $_res | cut -d, -f ${_INDEX})"
  if [ -z “${_res}” ] && [[ “${_DEFAULT}” != “@” ]]
  then
    echo "${_DEFAULT}"  
  else
    echo "${_res}"
  fi
}

# not sure why we'd need to split on backslash
# left commented out as an example to override default get() method
# status() {
#   get "^${pxname},${svnamem}," $stat | cut -d\  -f1
# }

# this allows for overriding default method of getting stats
# name a function by stat name for additional processing, custom returns, etc.
if type get_${stat} >/dev/null 2>&1
then
  debug "found custom query function"
  get_stats && get_${stat}
else
  debug "using default get() method"
  get_stats && get "^${pxname},${svname}," ${stat}
fi


! NB ! Substitute in the script /var/run/haproxy/haproxy.sock with your haproxy socket location

You can download the haproxy_stats.sh here and haproxy_discovery.sh here

2. Create the userparameter_haproxy_backend.conf

[root@haproxy-server1 zabbix_agentd.d]# cat userparameter_haproxy_backend.conf 
#
# Discovery Rule
#

# HAProxy Frontend, Backend and Server Discovery rules
UserParameter=haproxy.list.discovery[*],sudo /etc/zabbix/scripts/haproxy_discovery.sh SERVER
UserParameter=haproxy.stats[*],sudo /etc/zabbix/scripts/haproxy_stats.sh  $2 $3 $4

# support legacy way

UserParameter=haproxy.stat.downtime[*],sudo /etc/zabbix/scripts/haproxy_stats.sh  $2 $3 downtime

UserParameter=haproxy.stat.status[*],sudo /etc/zabbix/scripts/haproxy_stats.sh  $2 $3 status

UserParameter=haproxy.stat.last_chk[*],sudo /etc/zabbix/scripts/haproxy_stats.sh  $2 $3 last_chk

 

3. Create new simple template for the Application backend Monitoring and link it to monitored host

create-configuration-template-backend-monitoring

create-template-backend-monitoring-macros

 

Go to Configuration -> Hosts (find the host) and Link the template to it


4. Restart Zabbix-agent, in while check autodiscovery data is in Zabbix Server

[root@haproxy-server1 ]# systemctl restart zabbix-agent


Check in zabbix the userparameter data arrives, it should not be required to add any Items or Triggers as autodiscovery zabbix feature should automatically create in the server what is required for the data regarding backends to be in.

To view data arrives go to Zabbix config menus:

Configuration -> Hosts -> Hosts: (lookup for the haproxy-server1 hostname)

zabbix-discovery_rules-screenshot

The autodiscovery should have automatically created the following prototypes

zabbix-items-monitoring-prototypes
Now if you look inside Latest Data for the Host you should find some information like:

HAProxy Backend [backend1] (3 Items)
        
HAProxy Server [backend-name_APP/server1]: Connection Response
2022-05-13 14:15:04            History
        
HAProxy Server [backend-name/server2]: Downtime (hh:mm:ss)
2022-05-13 14:13:57    20:30:42        History
        
HAProxy Server [bk_name-APP/server1]: Status
2022-05-13 14:14:25    Up (1)        Graph
        ccnrlb01    HAProxy Backend [bk_CCNR_QA_ZVT] (3 Items)
        
HAProxy Server [bk_name-APP3/server1]: Connection Response
2022-05-13 14:15:05            History
        
HAProxy Server [bk_name-APP3/server1]: Downtime (hh:mm:ss)
2022-05-13 14:14:00    20:55:20        History
        
HAProxy Server [bk_name-APP3/server2]: Status
2022-05-13 14:15:08    Up (1)

To make alerting in case if a backend is down which usually you would like only left thing is to configure an Action to deliver alerts to some email address.

Get dmesg command kernel log report with human date / time timestamp on older Linux distributions

Friday, June 18th, 2021

how-to-get-dmesg-human-readable-timestamp-kernel-log-command-linux-logo

If you're a sysadmin you surely love to take a look at dmesg kernel log output. Usually on many Linux distributions there is a setup that dmesg keeps logging to log files /var/log/dmesg or /var/log/kern.log. But if you get some inherited old Linux servers it is quite possible that the previous machine maintainer did not enable the output of syslog to get logged in /var/log/{dmesg,kern.log,kernel.log}  or even have disabled the kernel log for some reason. Even though that in dmesg output you might find some interesting events reporting issues with Hard drives on its way to get broken / a bad / reads system processes crashing or whatever of other interesting information that could help you prevent severe servers downtimes or problems earlier but due to an old version of Linux distribution lets say Redhat 5 / Debian 6 or old CentOS / Fedora, the version of dmesg command shipped does not support the '-T' option that is present in util-linux package shipped with newer versions of  Redhat 7.X  / 8.X / SuSEs etc.  

 -T, –ctime
              Print human readable timestamps.  The timestamp could be inaccurate!

To illustrate better what I mean, here is an example from the non-human readable timestamp provided by older dmesg command

root@web-server~:# dmesg |tail -n 5
[4505913.361095] hid-generic 0003:1C4F:0002.000E: input,hidraw1: USB HID v1.10 Device [SIGMACHIP USB Keyboard] on usb-0000:00:1d.0-1.3/input1
[4558251.034024] Process accounting resumed
[4615396.191090] r8169 0000:03:00.0 eth1: Link is Down
[4615397.856950] r8169 0000:03:00.0 eth1: Link is Up – 100Mbps/Full – flow control rx/tx
[4644650.095723] Process accounting resumed

Thanksfully using below few lines of shell or perl scripts the dmesg -T  functionality could be added to the system , so you can easily get the proper timestamp out of the obscure default generated timestamp in the same manner as on newer distros.

Here is how to do with it with bash script:

#!/bin/sh paste in .bashrc and use dmesgt to get human readable timestamp
dmesg_with_human_timestamps () {
    FORMAT="%a %b %d %H:%M:%S %Y"

 

    now=$(date +%s)
    cputime_line=$(grep -m1 "\.clock" /proc/sched_debug)

    if [[ $cputime_line =~ [^0-9]*([0-9]*).* ]]; then
        cputime=$((BASH_REMATCH[1] / 1000))
    fi

    dmesg | while IFS= read -r line; do
        if [[ $line =~ ^\[\ *([0-9]+)\.[0-9]+\]\ (.*) ]]; then
            stamp=$((now-cputime+BASH_REMATCH[1]))
            echo "[$(date +”${FORMAT}” –date=@${stamp})] ${BASH_REMATCH[2]}"
        else
            echo "$line"
        fi
    done
}

Copy the script somewhere under lets say /usr/local/bin or wherever you like on the server and add into your HOME ~/.bashrc some alias like:
 

alias dmesgt=dmesg_with_timestamp.sh


You can get a copy dmesg_with_timestamp.sh of the script from here

Or you can use below few lines perl script to get the proper dmeg kernel date / time

 

#!/bin/perl
# on old Linux distros CentOS 6.0 etc. with dmesg (part of util-linux-ng-2.17.2-12.28.el6_9.2.x86_64) etc. dmesg -T not available
# workaround is little pl script below
dmesg_with_human_timestamps () {
    $(type -P dmesg) "$@" | perl -w -e 'use strict;
        my ($uptime) = do { local @ARGV="/proc/uptime";<>}; ($uptime) = ($uptime =~ /^(\d+)\./);
        foreach my $line (<>) {
            printf( ($line=~/^\[\s*(\d+)\.\d+\](.+)/) ? ( “[%s]%s\n", scalar localtime(time – $uptime + $1), $2 ) : $line )
        }'
}


Again to make use of the script put it under /usr/local/bin/check_dmesg_timestamp.pl

alias dmesgt=dmesg_with_human_timestamps

root@web-server:~# dmesgt | tail -n 20

[Sun Jun 13 15:51:49 2021] usb 2-1.3: USB disconnect, device number 9
[Sun Jun 13 15:51:50 2021] usb 2-1.3: new low-speed USB device number 10 using ehci-pci
[Sun Jun 13 15:51:50 2021] usb 2-1.3: New USB device found, idVendor=1c4f, idProduct=0002, bcdDevice= 1.10
[Sun Jun 13 15:51:50 2021] usb 2-1.3: New USB device strings: Mfr=1, Product=2, SerialNumber=0
[Sun Jun 13 15:51:50 2021] usb 2-1.3: Product: USB Keyboard
[Sun Jun 13 15:51:50 2021] usb 2-1.3: Manufacturer: SIGMACHIP
[Sun Jun 13 15:51:50 2021] input: SIGMACHIP USB Keyboard as /devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.3/2-1.3:1.0/0003:1C4F:0002.000D/input/input25
[Sun Jun 13 15:51:50 2021] hid-generic 0003:1C4F:0002.000D: input,hidraw0: USB HID v1.10 Keyboard [SIGMACHIP USB Keyboard] on usb-0000:00:1d.0-1.3/input0
[Sun Jun 13 15:51:50 2021] input: SIGMACHIP USB Keyboard Consumer Control as /devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.3/2-1.3:1.1/0003:1C4F:0002.000E/input/input26
[Sun Jun 13 15:51:50 2021] input: SIGMACHIP USB Keyboard System Control as /devices/pci0000:00/0000:00:1d.0/usb2/2-1/2-1.3/2-1.3:1.1/0003:1C4F:0002.000E/input/input27
[Sun Jun 13 15:51:50 2021] hid-generic 0003:1C4F:0002.000E: input,hidraw1: USB HID v1.10 Device [SIGMACHIP USB Keyboard] on usb-0000:00:1d.0-1.3/input1
[Mon Jun 14 06:24:08 2021] Process accounting resumed
[Mon Jun 14 22:16:33 2021] r8169 0000:03:00.0 eth1: Link is Down
[Mon Jun 14 22:16:34 2021] r8169 0000:03:00.0 eth1: Link is Up – 100Mbps/Full – flow control rx/tx

Send TCP / UDP strings and commands to Local and Remote Applications without netcat with Bash

Friday, July 24th, 2020

bash-open-network-tcp-udp-connections-from-shell-gnu-bash-shell-shell-logo

Did you ever needed to send TCP / UDP packets manually to send commands to local or remote applications, having a fully functional BASH Shell but not having the luxury to have NC (Netcat Swiss Army Knife of Networking) tool?
This happens if you have some Linux based embeded device as Arduino or a Linux server with a high security PCI requirement which can't affort to have Netcat in place or another portable hardware with a Linux kernel, that needs to communicate in UDP for any reason but you don't want to waste additional 28KB or physically you have access to a Linux device that doesn't have netcat but you want to be able to send UDP externally …

SInce some time in newer GNU Bash's releases support for TCP / UDP data sending is described in Bash's Manual and should be working it is not as good as you might expect but for a small things it could save you the day.

The syntax to use it is:
 

 /dev/protocol/IP/PORT


To open new socket connection to a UDP / TCP protocol with bash you have to simply open a new Shell handler (lets say 3) to:

 

/dev/tcp/your-url.com/80


or

/dev/udp/83.228.93.76/53

 

1. Get GOOGLE HTML Source with simple BASH / Getting URL Index with bash sockets


If you happen to have access to a machine where no network downloader tool or a text browser such as curl, wget, lynx, links is available but you want to dump the content of a index.html or any other URL with simply bash you can do it like so:
 

exec 3<>/dev/tcp/www.google.com/80
echo -e "GET / HTTP/1.1\r\n\r\n" >&3 

cat <&3

If you need to open a connection to a Internet Domain with bash and store the output into a separate .html file:

exec 3<>/dev/tcp/www.google.com/80
echo -e "GET / HTTP/1.1\r\n\r\n" >&3 

cat <&3 | tee -a output.html


Note that this will work only if you're logged into into an interactive shell.
If you want instead do it from a shell script (and omit usage) of wget etc. use something like bash_sockets_google_connect.sh basic script :

#!/bin/bash
exec 3<>/dev/tcp/www.google.com/80
printf "GET /\r\n\r\n" >&3
while IFS= read -r -u3 -t2 line || [[ -n “$line” ]]; do echo "$line"; done
exec 3>&-
exec 3<&-

 

2. Sending UDP protocol data via bash socket


To send test  variables or commands data to localhost (127.0.0.1) UDP listening service:
 

echo 'TEST COMMAND' > /dev/udp/127.0.0.1/538

 

echo "Any UDP data" > /dev/udp/127.0.0.1/3000


If you happent to have netcat or running on a bash shell that doesn't properly support TCP / UDP sending you can always do it netcat way:

echo "Command" | nc -u -w0 127.0.0.1 3000


Of course this little hack is useful just for simple things and eventually for more comlex stuff and scripting you would like to use a fully functional HTML reader ( W3C compliant Web Browser )
still  for a quick dirty stuff Bash socketing from the console rocks pretty much ! 🙂

 

Find when cron.daily cron.weekly and cron.monthly run on Redhat / CentOS / Debian Linux and systemd-timers

Wednesday, March 25th, 2020

Find-when-cron.daily-cron.monthly-cron.weekly-run-on-Redhat-CentOS-Debian-SuSE-SLES-Linux-cron-logo

 

The problem – Apache restart at random times


I've noticed today something that is occuring for quite some time but was out of my scope for quite long as I'm not directly involved in our Alert monitoring at my daily job as sys admin. Interestingly an Apache HTTPD webserver is triggering alarm twice a day for a short downtime that lasts for 9 seconds.

I've decided to investigate what is triggering WebServer restart in such random time and investigated on the system for any background running scripts as well as reviewed the system logs. As I couldn't find nothing there the only logical place to check was cron jobs.
The usual
 

crontab -u root -l


Had no configured cron jobbed scripts so I digged further to check whether there isn't cron jobs records for a script that is triggering the reload of Apache in /etc/crontab /var/spool/cron/root and /var/spool/cron/httpd.
Nothing was found there and hence as there was no anacron service running but /usr/sbin/crond the other expected place to look up for a trigger even was /etc/cron*

 

1. Configured default cron execution times, every day, every hour every month

 

# ls -ld /etc/cron.*
drwxr-xr-x 2 root root 4096 feb 27 10:54 /etc/cron.d/
drwxr-xr-x 2 root root 4096 dec 27 10:55 /etc/cron.daily/
drwxr-xr-x 2 root root 4096 dec  7 23:04 /etc/cron.hourly/
drwxr-xr-x 2 root root 4096 dec  7 23:04 /etc/cron.monthly/
drwxr-xr-x 2 root root 4096 dec  7 23:04 /etc/cron.weekly/

 

After a look up to each of above directories, finally I found the very expected logrorate shell script set to execute from /etc/cron.daily/logrotate and inside it I've found after the log files were set to be gzipped and moved to execute WebServer restart with:

systemctl reload httpd 

 

My first reaction was to ponder seriously why the script is invoking systemctl reload httpd instead of the good oldschool

apachectl -k graceful

 

But it seems on Redhat and CentOS since RHEL / CentOS version 6.X onwards systemctl reload httpd is supposed to be identical and a substitute for apachectl -k graceful.
Okay the craziness of innovation continued as obviously the reload was causing a Downtime to be visible in the Zabbix HTTPD port Monitoring graph …
Now as the problem was identified the other logical question poped up how to find out what is the exact timing scheduled to run the script in that unusual random times each time ??
 

2. Find out cron scripts timing Redhat / CentOS / Fedora / SLES

 

/etc/cron.{daily,monthly,weekly} placed scripts's execution method has changed over the years, causing a chaos just like many Linux standard things we know due to the inclusion of systemd and some other additional weird OS design changes. The result is the result explained above scripts are running at a strange unexpeted times … one thing that was intruduced was anacron – which is also executing commands periodically with a different preset frequency. However it is considered more thrustworhty by crond daemon, because anacron does not assume the machine is continuosly running and if the machine is down due to a shutdown or a failure (if it is a Virtual Machine) or simply a crond dies out, some cronjob necessery for overall set environment or application might not run, what anacron guarantees is even though that and even if crond is in unworking defunct state, the preset scheduled scripts will still be served.
anacron's default file location is in /etc/anacrontab.

A standard /etc/anacrontab looks like so:
 

[root@centos ~]:# cat /etc/anacrontab
# /etc/anacrontab: configuration file for anacron
 
# See anacron(8) and anacrontab(5) for details.
 
SHELL=/bin/sh
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
# the maximal random delay added to the base delay of the jobs
RANDOM_DELAY=45
# the jobs will be started during the following hours only
START_HOURS_RANGE=3-22
 
#period in days   delay in minutes   job-identifier   command
1    5    cron.daily        nice run-parts /etc/cron.daily
7    25    cron.weekly        nice run-parts /etc/cron.weekly
@monthly 45    cron.monthly        nice run-parts /etc/cron.monthly

 

START_HOURS_RANGE : The START_HOURS_RANGE variable sets the time frame, when the job could started. 
The jobs will start during the 3-22 (3AM-10PM) hours only.

  • cron.daily will run at 3:05 (After Midnight) A.M. i.e. run once a day at 3:05AM.
  • cron.weekly will run at 3:25 AM i.e. run once a week at 3:25AM.
  • cron.monthly will run at 3:45 AM i.e. run once a month at 3:45AM.

If the RANDOM_DELAY env var. is set, a random value between 0 and RANDOM_DELAY minutes will be added to the start up delay of anacron served jobs. 
For instance RANDOM_DELAY equels 45 would therefore add, randomly, between 0 and 45 minutes to the user defined delay. 

Delay will be 5 minutes + RANDOM_DELAY for cron.daily for above cron.daily, cron.weekly, cron.monthly config records, i.e. 05:01 + 0-45 minutes

A full detailed explanation on automating system tasks on Redhat Enterprise Linux is worthy reading here.

!!! Note !!! that listed jobs will be running in queue. After one finish, then next will start.
 

3. SuSE Enterprise Linux cron jobs not running at desired times why?


in SuSE it is much more complicated to have a right timing for standard default cron jobs that comes preinstalled with a service 

In older SLES release /etc/crontab looked like so:

 

SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/

# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly


As time of writting article it looks like:

 

SHELL=/bin/sh
PATH=/usr/bin:/usr/sbin:/sbin:/bin:/usr/lib/news/bin
MAILTO=root
#
# check scripts in cron.hourly, cron.daily, cron.weekly, and cron.monthly
#
-*/15 * * * *   root  test -x /usr/lib/cron/run-crons && /usr/lib/cron/run-crons >/dev/null 2>&1

 

 


This runs any scripts placed in /etc/cron.{hourly, daily, weekly, monthly} but it may not run them when you expect them to run. 
/usr/lib/cron/run-crons compares the current time to the /var/spool/cron/lastrun/cron.{time} file to determine if those jobs need to be run.

For hourly, it checks if the current time is greater than (or exactly) 60 minutes past the timestamp of the /var/spool/cron/lastrun/cron.hourly file.

For weekly, it checks if the current time is greater than (or exactly) 10080 minutes past the timestamp of the /var/spool/cron/lastrun/cron.weekly file.

Monthly uses a caclucation to check the time difference, but is the same type of check to see if it has been one month after the last run.

Daily has a couple variations available – By default it checks if it is more than or exactly 1440 minutes since lastrun.
If DAILY_TIME is set in the /etc/sysconfig/cron file (again a suse specific innovation), then that is the time (within 15minutes) when daily will run.

For systems that are powered off at DAILY_TIME, daily tasks will run at the DAILY_TIME, unless it has been more than x days, if it is, they run at the next running of run-crons. (default 7days, can set shorter time in /etc/sysconfig/cron.)
Because of these changes, the first time you place a job in one of the /etc/cron.{time} directories, it will run the next time run-crons runs, which is at every 15mins (xx:00, xx:15, xx:30, xx:45) and that time will be the lastrun, and become the normal schedule for future runs. Note that there is the potential that your schedules will begin drift by 15minute increments.

As you see this is very complicated stuff and since God is in the simplicity it is much better to just not use /etc/cron.* for whatever scripts and manually schedule each of the system cron jobs and custom scripts with cron at specific times.


4. Debian Linux time start schedule for cron.daily / cron.monthly / cron.weekly timing

As the last many years many of the servers I've managed were running Debian GNU / Linux, my first place to check was /etc/crontab which is the standard cronjobs file that is setting the { daily , monthly , weekly crons } 

 

 debian:~# ls -ld /etc/cron.*
drwxr-xr-x 2 root root 4096 фев 27 10:54 /etc/cron.d/
drwxr-xr-x 2 root root 4096 фев 27 10:55 /etc/cron.daily/
drwxr-xr-x 2 root root 4096 дек  7 23:04 /etc/cron.hourly/
drwxr-xr-x 2 root root 4096 дек  7 23:04 /etc/cron.monthly/
drwxr-xr-x 2 root root 4096 дек  7 23:04 /etc/cron.weekly/

 

debian:~# cat /etc/crontab 
# /etc/crontab: system-wide crontab
# Unlike any other crontab you don't have to run the `crontab'
# command to install the new version when you edit this file
# and files in /etc/cron.d. These files also have username fields,
# that none of the other crontabs do.

SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin# Example of job definition:
# .—————- minute (0 – 59)
# |  .————- hour (0 – 23)
# |  |  .———- day of month (1 – 31)
# |  |  |  .——- month (1 – 12) OR jan,feb,mar,apr …
# |  |  |  |  .—- day of week (0 – 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# |  |  |  |  |
# *  *  *  *  * user-name command to be executed
17 *    * * *    root    cd / && run-parts –report /etc/cron.hourly
25 6    * * *    root    test -x /usr/sbin/anacron || ( cd / && run-parts –report /etc/cron.daily )
47 6    * * 7    root    test -x /usr/sbin/anacron || ( cd / && run-parts –report /etc/cron.weekly )
52 6    1 * *    root    test -x /usr/sbin/anacron || ( cd / && run-parts –report /etc/cron.monthly )

What above does is:

– Run cron.hourly once at every hour at 1:17 am
– Run cron.daily once at every day at 6:25 am.
– Run cron.weekly once at every day at 6:47 am.
– Run cron.monthly once at every day at 6:42 am.

As you can see if anacron is present on the system it is run via it otherwise it is run via run-parts binary command which is reading and executing one by one all scripts insude /etc/cron.hourly, /etc/cron.weekly , /etc/cron.mothly

anacron – few more words

Anacron is the canonical way to run at least the jobs from /etc/cron.{daily,weekly,monthly) after startup, even when their execution was missed because the system was not running at the given time. Anacron does not handle any cron jobs from /etc/cron.d, so any package that wants its /etc/cron.d cronjob being executed by anacron needs to take special measures.

If anacron is installed, regular processing of the /etc/cron.d{daily,weekly,monthly} is omitted by code in /etc/crontab but handled by anacron via /etc/anacrontab. Anacron's execution of these job lists has changed multiple times in the past:

debian:~# cat /etc/anacrontab 
# /etc/anacrontab: configuration file for anacron

# See anacron(8) and anacrontab(5) for details.

SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
HOME=/root
LOGNAME=root

# These replace cron's entries
1    5    cron.daily    run-parts –report /etc/cron.daily
7    10    cron.weekly    run-parts –report /etc/cron.weekly
@monthly    15    cron.monthly    run-parts –report /etc/cron.monthly

In wheezy and earlier, anacron is executed via init script on startup and via /etc/cron.d at 07:30. This causes the jobs to be run in order, if scheduled, beginning at 07:35. If the system is rebooted between midnight and 07:35, the jobs run after five minutes of uptime.
In stretch, anacron is executed via a systemd timer every hour, including the night hours. This causes the jobs to be run in order, if scheduled, beween midnight and 01:00, which is a significant change to the previous behavior.
In buster, anacron is executed via a systemd timer every hour with the exception of midnight to 07:00 where anacron is not invoked. This brings back a bit of the old timing, with the jobs to be run in order, if scheduled, beween 07:00 and 08:00. Since anacron is also invoked once at system startup, a reboot between midnight and 08:00 also causes the jobs to be scheduled after five minutes of uptime.
anacron also didn't have an upstream release in nearly two decades and is also currently orphaned in Debian.

As of 2019-07 (right after buster's release) it is planned to have cron and anacron replaced by cronie.

cronie – Cronie was forked by Red Hat from ISC Cron 4.1 in 2007, is the default cron implementation in Fedora and Red Hat Enterprise Linux at least since Version 6. cronie seems to have an acive upstream, but is currently missing some of the things that Debian has added to vixie cron over the years. With the finishing of cron's conversion to quilt (3.0), effort can begin to add the Debian extensions to Vixie cron to cronie.

Because cronie doesn't have all the Debian extensions yet, it is not yet suitable as a cron replacement, so it is not in Debian.
 

5. systemd-timers – The new crazy systemd stuff for script system job scheduling


Timers are systemd unit files with a suffix of .timer. systemd-timers was introduced with systemd so older Linux OS-es does not have it.
 Timers are like other unit configuration files and are loaded from the same paths but include a [Timer] section which defines when and how the timer activates. Timers are defined as one of two types:

 

  • Realtime timers (a.k.a. wallclock timers) activate on a calendar event, the same way that cronjobs do. The option OnCalendar= is used to define them.
  • Monotonic timers activate after a time span relative to a varying starting point. They stop if the computer is temporarily suspended or shut down. There are number of different monotonic timers but all have the form: OnTypeSec=. Common monotonic timers include OnBootSec and OnActiveSec.

     

     

    For each .timer file, a matching .service file exists (e.g. foo.timer and foo.service). The .timer file activates and controls the .service file. The .service does not require an [Install] section as it is the timer units that are enabled. If necessary, it is possible to control a differently-named unit using the Unit= option in the timer’s [Timer] section.

    systemd-timers is a complex stuff and I'll not get into much details but the idea was to give awareness of its existence for more info check its manual man systemd.timer

Its most basic use is to list all configured systemd.timers, below is from my home Debian laptop
 

debian:~# systemctl list-timers –all
NEXT                         LEFT         LAST                         PASSED       UNIT                         ACTIVATES
Tue 2020-03-24 23:33:58 EET  18s left     Tue 2020-03-24 23:31:28 EET  2min 11s ago laptop-mode.timer            lmt-poll.service
Tue 2020-03-24 23:39:00 EET  5min left    Tue 2020-03-24 23:09:01 EET  24min ago    phpsessionclean.timer        phpsessionclean.service
Wed 2020-03-25 00:00:00 EET  26min left   Tue 2020-03-24 00:00:01 EET  23h ago      logrotate.timer              logrotate.service
Wed 2020-03-25 00:00:00 EET  26min left   Tue 2020-03-24 00:00:01 EET  23h ago      man-db.timer                 man-db.service
Wed 2020-03-25 02:38:42 EET  3h 5min left Tue 2020-03-24 13:02:01 EET  10h ago      apt-daily.timer              apt-daily.service
Wed 2020-03-25 06:13:02 EET  6h left      Tue 2020-03-24 08:48:20 EET  14h ago      apt-daily-upgrade.timer      apt-daily-upgrade.service
Wed 2020-03-25 07:31:57 EET  7h left      Tue 2020-03-24 23:30:28 EET  3min 11s ago anacron.timer                anacron.service
Wed 2020-03-25 17:56:01 EET  18h left     Tue 2020-03-24 17:56:01 EET  5h 37min ago systemd-tmpfiles-clean.timer systemd-tmpfiles-clean.service

 

8 timers listed.


N ! B! If a timer gets out of sync, it may help to delete its stamp-* file in /var/lib/systemd/timers (or ~/.local/share/systemd/ in case of user timers). These are zero length files which mark the last time each timer was run. If deleted, they will be reconstructed on the next start of their timer.

Summary

In this article, I've shortly explain logic behind debugging weird restart events etc. of Linux configured services such as Apache due to configured scripts set to run with a predefined scheduled job timing. I shortly explained on how to figure out why the preset default install configured cron jobs such as logrorate – the service that is doing system logs archiving and nulling run at a certain time. I shortly explained the mechanism behind cron.{daily, monthy, weekly} and its execution via anacron – runner program similar to crond that never misses to run a scheduled job even if a system downtime occurs due to a crashed Docker container etc. run-parts command's use was shortly explained. A short look at systemd.timers was made which is now essential part of almost every new Linux release and often used by system scripts for scheduling time based maintainance tasks.

How to build Linux logging bash shell script write_log, logging with Named Pipe buffer, Simple Linux common log files logging with logger command

Monday, August 26th, 2019

how-to-build-bash-script-for-logging-buffer-named-pipes-basic-common-files-logging-with-logger-command

Logging into file in GNU / Linux and FreeBSD is as simple as simply redirecting the output, e.g.:
 

echo "$(date) Whatever" >> /home/hipo/log/output_file_log.txt


or with pyping to tee command

 

echo "$(date) Service has Crashed" | tee -a /home/hipo/log/output_file_log.txt


But what if you need to create a full featured logging bash robust shell script function that will run as a daemon continusly as a background process and will output
all content from itself to an external log file?
In below article, I've given example logging script in bash, as well as small example on how a specially crafted Named Pipe buffer can be used that will later store to a file of choice.
Finally I found it interesting to mention few words about logger command which can be used to log anything to many of the common / general Linux log files stored under /var/log/ – i.e. /var/log/syslog /var/log/user /var/log/daemon /var/log/mail etc.
 

1. Bash script function for logging write_log();


Perhaps the simplest method is just to use a small function routine in your shell script like this:
 

write_log()
LOG_FILE='/root/log.txt';
{
  while read text
  do
      LOGTIME=`date "+%Y-%m-%d %H:%M:%S"`
      # If log file is not defined, just echo the output
      if [ “$LOG_FILE” == “” ]; then
    echo $LOGTIME": $text";
      else
        LOG=$LOG_FILE.`date +%Y%m%d`
    touch $LOG
        if [ ! -f $LOG ]; then echo "ERROR!! Cannot create log file $LOG. Exiting."; exit 1; fi
    echo $LOGTIME": $text" | tee -a $LOG;
      fi
  done
}

 

  •  Using the script from within itself or from external to write out to defined log file

 

echo "Skipping to next copy" | write_log

 

2. Use Unix named pipes to pass data – Small intro on what is Unix Named Pipe.


Named Pipe –  a named pipe (also known as a FIFO (First In First Out) for its behavior) is an extension to the traditional pipe concept on Unix and Unix-like systems, and is one of the methods of inter-process communication (IPC). The concept is also found in OS/2 and Microsoft Windows, although the semantics differ substantially. A traditional pipe is "unnamed" and lasts only as long as the process. A named pipe, however, can last as long as the system is up, beyond the life of the process. It can be deleted if no longer used.
Usually a named pipe appears as a file, and generally processes attach to it for IPC.

 

Once named pipes were shortly explained for those who hear it for a first time, its time to say named pipe in unix / linux is created with mkfifo command, syntax is straight foward:
 

mkfifo /tmp/name-of-named-pipe


Some older Linux-es with older bash and older bash shell scripts were using mknod.
So idea behind logging script is to use a simple named pipe read input and use date command to log the exact time the command was executed, here is the script.

 

#!/bin/bash
named_pipe='/tmp/output-named-pipe';
output_named_log='
/tmp/output-named-log.txt ';

if [ -p $named_pipe ]; then
rm -f $named_pipe
fi
mkfifo $named_pipe

while true; do
read LINE <$named_pipe
echo $(date): "$LINE" >>/tmp/output-named-log.txt
done


To write out any other script output and get logged now, any of your output with a nice current date command generated output write out any output content to the loggin buffer like so:

 

echo 'Using Named pipes is so cool' > /tmp/output-named-pipe
echo 'Disk is full on a trigger' > /tmp/output-named-pipe

  • Getting the output with the date timestamp

# cat /tmp/output-named-log.txt
Mon Aug 26 15:21:29 EEST 2019: Using Named pipes is so cool
Mon Aug 26 15:21:54 EEST 2019: Disk is full on a trigger


If you wonder why it is better to use Named pipes for logging, they perform better (are generally quicker) than Unix sockets.

 

3. Logging files to system log files with logger

 

If you need to do a one time quick way to log any message of your choice with a standard Logging timestamp, take a look at logger (a part of bsdutils Linux package), and is a command which is used to enter messages into the system log, to use it simply invoke it with a message and it will log your specified output by default to /var/log/syslog common logfile

 

root@linux:/root# logger 'Here we go, logging'
root@linux:/root # tail -n 3 /var/log/syslog
Aug 26 15:41:01 localhost CRON[24490]: (root) CMD (chown qscand:qscand -R /var/run/clamav/ 2>&1 >/dev/null)
Aug 26 15:42:01 localhost CRON[24547]: (root) CMD (chown qscand:qscand -R /var/run/clamav/ 2>&1 >/dev/null)
Aug 26 15:42:20 localhost hipo: Here we go, logging

 

If you have took some time to read any of the init.d scripts on Debian / Fedora / RHEL / CentOS Linux etc. you will notice the logger logging facility is heavily used.

With logger you can print out message with different priorities (e.g. if you want to write an error message to mail.* logs), you can do so with:
 

 logger -i -p mail.err "Output of mail processing script"


To log a normal non-error (priority message) with logger to /var/log/mail.log system log.

 

 logger -i -p mail.notice "Output of mail processing script"


A whole list of supported facility named priority valid levels by logger (as taken of its current Linux manual) are as so:

 

FACILITIES AND LEVELS
       Valid facility names are:

              auth
              authpriv   for security information of a sensitive nature
              cron
              daemon
              ftp
              kern       cannot be generated from userspace process, automatically converted to user
              lpr
              mail
              news
              syslog
              user
              uucp
              local0
                to
              local7
              security   deprecated synonym for auth

       Valid level names are:

              emerg
              alert
              crit
              err
              warning
              notice
              info
              debug
              panic     deprecated synonym for emerg
              error     deprecated synonym for err
              warn      deprecated synonym for warning

       For the priority order and intended purposes of these facilities and levels, see syslog(3).

 


If you just want to log to Linux main log file (be it /var/log/syslog or /var/log/messages), depending on the Linux distribution, just type', even without any shell quoting:

 

logger 'The reason to reboot the server Currently was a System security Update

 

So what others is logger useful for?

 In addition to being a good diagnostic tool, you can use logger to test if all basic system logs with its respective priorities work as expected, this is especially
useful as I've seen on a Cloud Holsted OpenXEN based servers as a SAP consultant, that sometimes logging to basic log files stops to log for months or even years due to
syslog and syslog-ng problems hungs by other thirt party scripts and programs.
To test test all basic logging and priority on system logs as expected use the following logger-test-all-basic-log-logging-facilities.sh shell script.

 

#!/bin/bash
for i in {auth,auth-priv,cron,daemon,kern, \
lpr,mail,mark,news,syslog,user,uucp,local0 \
,local1,local2,local3,local4,local5,local6,local7}

do        
# (this is all one line!)

 

for k in {debug,info,notice,warning,err,crit,alert,emerg}
do

logger -p $i.$k "Test daemon message, facility $i priority $k"

done

done

Note that on different Linux distribution verions, the facility and priority names might differ so, if you get

logger: unknown facility name: {auth,auth-priv,cron,daemon,kern,lpr,mail,mark,news, \
syslog,user,uucp,local0,local1,local2,local3,local4, \
local5,local6,local7}

check and set the proper naming as described in logger man page.

 

4. Using a file descriptor that will output to a pre-set log file


Another way is to add the following code to the beginning of the script

#!/bin/bash
exec 3>&1 4>&2
trap 'exec 2>&4 1>&3' 0 1 2 3
exec 1>log.out 2>&1
# Everything below will go to the file 'log.out':

The code Explaned

  •     Saves file descriptors so they can be restored to whatever they were before redirection or used themselves to output to whatever they were before the following redirect.
    trap 'exec 2>&4 1>&3' 0 1 2 3
  •     Restore file descriptors for particular signals. Not generally necessary since they should be restored when the sub-shell exits.

          exec 1>log.out 2>&1

  •     Redirect stdout to file log.out then redirect stderr to stdout. Note that the order is important when you want them going to the same file. stdout must be redirected before stderr is redirected to stdout.

From then on, to see output on the console (maybe), you can simply redirect to &3. For example
,

echo "$(date) : Do print whatever you want logging to &3 file handler" >&3


I've initially found out about this very nice bash code from serverfault.com's post how can I fully log all bash script actions (but unfortunately on latest Debian 10 Buster Linux  that is prebundled with bash shell 5.0.3(1)-release the code doesn't behave exactly, well but still on older bash versions it works fine.

Sum it up


To shortlysummarize there is plenty of ways to do logging from a shell script logger command but using a function or a named pipe is the most classic. Sometimes if a script is supposed to write user or other script output to a a common file such as syslog, logger command can be used as it is present across most modern Linux distros.
If you have a better ways, please drop a common and I'll add it to this article.