Archive for the ‘Linux’ Category

Find when cron.daily cron.weekly and cron.monthly run on Redhat / CentOS / Debian Linux and systemd-timers

Wednesday, March 25th, 2020



The problem – Apache restart at random times

I've noticed today something that is occuring for quite some time but was out of my scope for quite long as I'm not directly involved in our Alert monitoring at my daily job as sys admin. Interestingly an Apache HTTPD webserver is triggering alarm twice a day for a short downtime that lasts for 9 seconds.

I've decided to investigate what is triggering WebServer restart in such random time and investigated on the system for any background running scripts as well as reviewed the system logs. As I couldn't find nothing there the only logical place to check was cron jobs.
The usual

crontab -u root -l

Had no configured cron jobbed scripts so I digged further to check whether there isn't cron jobs records for a script that is triggering the reload of Apache in /etc/crontab /var/spool/cron/root and /var/spool/cron/httpd.
Nothing was found there and hence as there was no anacron service running but /usr/sbin/crond the other expected place to look up for a trigger even was /etc/cron*


1. Configured default cron execution times, every day, every hour every month


# ls -ld /etc/cron.*
drwxr-xr-x 2 root root 4096 feb 27 10:54 /etc/cron.d/
drwxr-xr-x 2 root root 4096 dec 27 10:55 /etc/cron.daily/
drwxr-xr-x 2 root root 4096 dec  7 23:04 /etc/cron.hourly/
drwxr-xr-x 2 root root 4096 dec  7 23:04 /etc/cron.monthly/
drwxr-xr-x 2 root root 4096 dec  7 23:04 /etc/cron.weekly/


After a look up to each of above directories, finally I found the very expected logrorate shell script set to execute from /etc/cron.daily/logrotate and inside it I've found after the log files were set to be gzipped and moved to execute WebServer restart with:

systemctl reload httpd 


My first reaction was to ponder seriously why the script is invoking systemctl reload httpd instead of the good oldschool

apachectl -k graceful


But it seems on Redhat and CentOS since RHEL / CentOS version 6.X onwards systemctl reload httpd is supposed to be identical and a substitute for apachectl -k graceful.
Okay the craziness of innovation continued as obviously the reload was causing a Downtime to be visible in the Zabbix HTTPD port Monitoring graph …
Now as the problem was identified the other logical question poped up how to find out what is the exact timing scheduled to run the script in that unusual random times each time ??

2. Find out cron scripts timing Redhat / CentOS / Fedora / SLES


/etc/cron.{daily,monthly,weekly} placed scripts's execution method has changed over the years, causing a chaos just like many Linux standard things we know due to the inclusion of systemd and some other additional weird OS design changes. The result is the result explained above scripts are running at a strange unexpeted times … one thing that was intruduced was anacron – which is also executing commands periodically with a different preset frequency. However it is considered more thrustworhty by crond daemon, because anacron does not assume the machine is continuosly running and if the machine is down due to a shutdown or a failure (if it is a Virtual Machine) or simply a crond dies out, some cronjob necessery for overall set environment or application might not run, what anacron guarantees is even though that and even if crond is in unworking defunct state, the preset scheduled scripts will still be served.
anacron's default file location is in /etc/anacrontab.

A standard /etc/anacrontab looks like so:

[root@centos ~]:# cat /etc/anacrontab
# /etc/anacrontab: configuration file for anacron
# See anacron(8) and anacrontab(5) for details.
# the maximal random delay added to the base delay of the jobs
# the jobs will be started during the following hours only
#period in days   delay in minutes   job-identifier   command
1    5    cron.daily        nice run-parts /etc/cron.daily
7    25    cron.weekly        nice run-parts /etc/cron.weekly
@monthly 45    cron.monthly        nice run-parts /etc/cron.monthly


START_HOURS_RANGE : The START_HOURS_RANGE variable sets the time frame, when the job could started. 
The jobs will start during the 3-22 (3AM-10PM) hours only.

  • cron.daily will run at 3:05 (After Midnight) A.M. i.e. run once a day at 3:05AM.
  • cron.weekly will run at 3:25 AM i.e. run once a week at 3:25AM.
  • cron.monthly will run at 3:45 AM i.e. run once a month at 3:45AM.

If the RANDOM_DELAY env var. is set, a random value between 0 and RANDOM_DELAY minutes will be added to the start up delay of anacron served jobs. 
For instance RANDOM_DELAY equels 45 would therefore add, randomly, between 0 and 45 minutes to the user defined delay. 

Delay will be 5 minutes + RANDOM_DELAY for cron.daily for above cron.daily, cron.weekly, cron.monthly config records, i.e. 05:01 + 0-45 minutes

A full detailed explanation on automating system tasks on Redhat Enterprise Linux is worthy reading here.

!!! Note !!! that listed jobs will be running in queue. After one finish, then next will start.

3. SuSE Enterprise Linux cron jobs not running at desired times why?

in SuSE it is much more complicated to have a right timing for standard default cron jobs that comes preinstalled with a service 

In older SLES release /etc/crontab looked like so:



# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly

As time of writting article it looks like:


# check scripts in cron.hourly, cron.daily, cron.weekly, and cron.monthly
-*/15 * * * *   root  test -x /usr/lib/cron/run-crons && /usr/lib/cron/run-crons >/dev/null 2>&1



This runs any scripts placed in /etc/cron.{hourly, daily, weekly, monthly} but it may not run them when you expect them to run. 
/usr/lib/cron/run-crons compares the current time to the /var/spool/cron/lastrun/cron.{time} file to determine if those jobs need to be run.

For hourly, it checks if the current time is greater than (or exactly) 60 minutes past the timestamp of the /var/spool/cron/lastrun/cron.hourly file.

For weekly, it checks if the current time is greater than (or exactly) 10080 minutes past the timestamp of the /var/spool/cron/lastrun/cron.weekly file.

Monthly uses a caclucation to check the time difference, but is the same type of check to see if it has been one month after the last run.

Daily has a couple variations available – By default it checks if it is more than or exactly 1440 minutes since lastrun.
If DAILY_TIME is set in the /etc/sysconfig/cron file (again a suse specific innovation), then that is the time (within 15minutes) when daily will run.

For systems that are powered off at DAILY_TIME, daily tasks will run at the DAILY_TIME, unless it has been more than x days, if it is, they run at the next running of run-crons. (default 7days, can set shorter time in /etc/sysconfig/cron.)
Because of these changes, the first time you place a job in one of the /etc/cron.{time} directories, it will run the next time run-crons runs, which is at every 15mins (xx:00, xx:15, xx:30, xx:45) and that time will be the lastrun, and become the normal schedule for future runs. Note that there is the potential that your schedules will begin drift by 15minute increments.

As you see this is very complicated stuff and since God is in the simplicity it is much better to just not use /etc/cron.* for whatever scripts and manually schedule each of the system cron jobs and custom scripts with cron at specific times.

4. Debian Linux time start schedule for cron.daily / cron.monthly / cron.weekly timing

As the last many years many of the servers I've managed were running Debian GNU / Linux, my first place to check was /etc/crontab which is the standard cronjobs file that is setting the { daily , monthly , weekly crons } 


 debian:~# ls -ld /etc/cron.*
drwxr-xr-x 2 root root 4096 фев 27 10:54 /etc/cron.d/
drwxr-xr-x 2 root root 4096 фев 27 10:55 /etc/cron.daily/
drwxr-xr-x 2 root root 4096 дек  7 23:04 /etc/cron.hourly/
drwxr-xr-x 2 root root 4096 дек  7 23:04 /etc/cron.monthly/
drwxr-xr-x 2 root root 4096 дек  7 23:04 /etc/cron.weekly/


debian:~# cat /etc/crontab 
# /etc/crontab: system-wide crontab
# Unlike any other crontab you don't have to run the `crontab'
# command to install the new version when you edit this file
# and files in /etc/cron.d. These files also have username fields,
# that none of the other crontabs do.

PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin# Example of job definition:
# .—————- minute (0 – 59)
# |  .————- hour (0 – 23)
# |  |  .———- day of month (1 – 31)
# |  |  |  .——- month (1 – 12) OR jan,feb,mar,apr …
# |  |  |  |  .—- day of week (0 – 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# |  |  |  |  |
# *  *  *  *  * user-name command to be executed
17 *    * * *    root    cd / && run-parts –report /etc/cron.hourly
25 6    * * *    root    test -x /usr/sbin/anacron || ( cd / && run-parts –report /etc/cron.daily )
47 6    * * 7    root    test -x /usr/sbin/anacron || ( cd / && run-parts –report /etc/cron.weekly )
52 6    1 * *    root    test -x /usr/sbin/anacron || ( cd / && run-parts –report /etc/cron.monthly )

What above does is:

– Run cron.hourly once at every hour at 1:17 am
– Run cron.daily once at every day at 6:25 am.
– Run cron.weekly once at every day at 6:47 am.
– Run cron.monthly once at every day at 6:42 am.

As you can see if anacron is present on the system it is run via it otherwise it is run via run-parts binary command which is reading and executing one by one all scripts insude /etc/cron.hourly, /etc/cron.weekly , /etc/cron.mothly

anacron – few more words

Anacron is the canonical way to run at least the jobs from /etc/cron.{daily,weekly,monthly) after startup, even when their execution was missed because the system was not running at the given time. Anacron does not handle any cron jobs from /etc/cron.d, so any package that wants its /etc/cron.d cronjob being executed by anacron needs to take special measures.

If anacron is installed, regular processing of the /etc/cron.d{daily,weekly,monthly} is omitted by code in /etc/crontab but handled by anacron via /etc/anacrontab. Anacron's execution of these job lists has changed multiple times in the past:

debian:~# cat /etc/anacrontab 
# /etc/anacrontab: configuration file for anacron

# See anacron(8) and anacrontab(5) for details.


# These replace cron's entries
1    5    cron.daily    run-parts –report /etc/cron.daily
7    10    cron.weekly    run-parts –report /etc/cron.weekly
@monthly    15    cron.monthly    run-parts –report /etc/cron.monthly

In wheezy and earlier, anacron is executed via init script on startup and via /etc/cron.d at 07:30. This causes the jobs to be run in order, if scheduled, beginning at 07:35. If the system is rebooted between midnight and 07:35, the jobs run after five minutes of uptime.
In stretch, anacron is executed via a systemd timer every hour, including the night hours. This causes the jobs to be run in order, if scheduled, beween midnight and 01:00, which is a significant change to the previous behavior.
In buster, anacron is executed via a systemd timer every hour with the exception of midnight to 07:00 where anacron is not invoked. This brings back a bit of the old timing, with the jobs to be run in order, if scheduled, beween 07:00 and 08:00. Since anacron is also invoked once at system startup, a reboot between midnight and 08:00 also causes the jobs to be scheduled after five minutes of uptime.
anacron also didn't have an upstream release in nearly two decades and is also currently orphaned in Debian.

As of 2019-07 (right after buster's release) it is planned to have cron and anacron replaced by cronie.

cronie – Cronie was forked by Red Hat from ISC Cron 4.1 in 2007, is the default cron implementation in Fedora and Red Hat Enterprise Linux at least since Version 6. cronie seems to have an acive upstream, but is currently missing some of the things that Debian has added to vixie cron over the years. With the finishing of cron's conversion to quilt (3.0), effort can begin to add the Debian extensions to Vixie cron to cronie.

Because cronie doesn't have all the Debian extensions yet, it is not yet suitable as a cron replacement, so it is not in Debian.

5. systemd-timers – The new crazy systemd stuff for script system job scheduling

Timers are systemd unit files with a suffix of .timer. systemd-timers was introduced with systemd so older Linux OS-es does not have it.
 Timers are like other unit configuration files and are loaded from the same paths but include a [Timer] section which defines when and how the timer activates. Timers are defined as one of two types:


  • Realtime timers (a.k.a. wallclock timers) activate on a calendar event, the same way that cronjobs do. The option OnCalendar= is used to define them.
  • Monotonic timers activate after a time span relative to a varying starting point. They stop if the computer is temporarily suspended or shut down. There are number of different monotonic timers but all have the form: OnTypeSec=. Common monotonic timers include OnBootSec and OnActiveSec.



    For each .timer file, a matching .service file exists (e.g. foo.timer and foo.service). The .timer file activates and controls the .service file. The .service does not require an [Install] section as it is the timer units that are enabled. If necessary, it is possible to control a differently-named unit using the Unit= option in the timer’s [Timer] section.

    systemd-timers is a complex stuff and I'll not get into much details but the idea was to give awareness of its existence for more info check its manual man systemd.timer

Its most basic use is to list all configured systemd.timers, below is from my home Debian laptop

debian:~# systemctl list-timers –all
NEXT                         LEFT         LAST                         PASSED       UNIT                         ACTIVATES
Tue 2020-03-24 23:33:58 EET  18s left     Tue 2020-03-24 23:31:28 EET  2min 11s ago laptop-mode.timer            lmt-poll.service
Tue 2020-03-24 23:39:00 EET  5min left    Tue 2020-03-24 23:09:01 EET  24min ago    phpsessionclean.timer        phpsessionclean.service
Wed 2020-03-25 00:00:00 EET  26min left   Tue 2020-03-24 00:00:01 EET  23h ago      logrotate.timer              logrotate.service
Wed 2020-03-25 00:00:00 EET  26min left   Tue 2020-03-24 00:00:01 EET  23h ago      man-db.timer                 man-db.service
Wed 2020-03-25 02:38:42 EET  3h 5min left Tue 2020-03-24 13:02:01 EET  10h ago      apt-daily.timer              apt-daily.service
Wed 2020-03-25 06:13:02 EET  6h left      Tue 2020-03-24 08:48:20 EET  14h ago      apt-daily-upgrade.timer      apt-daily-upgrade.service
Wed 2020-03-25 07:31:57 EET  7h left      Tue 2020-03-24 23:30:28 EET  3min 11s ago anacron.timer                anacron.service
Wed 2020-03-25 17:56:01 EET  18h left     Tue 2020-03-24 17:56:01 EET  5h 37min ago systemd-tmpfiles-clean.timer systemd-tmpfiles-clean.service


8 timers listed.

N ! B! If a timer gets out of sync, it may help to delete its stamp-* file in /var/lib/systemd/timers (or ~/.local/share/systemd/ in case of user timers). These are zero length files which mark the last time each timer was run. If deleted, they will be reconstructed on the next start of their timer.


In this article, I've shortly explain logic behind debugging weird restart events etc. of Linux configured services such as Apache due to configured scripts set to run with a predefined scheduled job timing. I shortly explained on how to figure out why the preset default install configured cron jobs such as logrorate – the service that is doing system logs archiving and nulling run at a certain time. I shortly explained the mechanism behind cron.{daily, monthy, weekly} and its execution via anacron – runner program similar to crond that never misses to run a scheduled job even if a system downtime occurs due to a crashed Docker container etc. run-parts command's use was shortly explained. A short look at systemd.timers was made which is now essential part of almost every new Linux release and often used by system scripts for scheduling time based maintainance tasks.

Check if server is Physical Bare Metal or a Virtual Machine and its type

Tuesday, March 17th, 2020


In modern times the IT employee system administrator / system engineer / security engineer or a developer who has to develop and test code remotely on UNIX hosts, we have to login to multiple of different servers located in separate data centers around the world situated in Hybrid Operating system environments running multitude of different Linux OSes. Often especially for us sysadmins it is important to know whether the remote machine we have SSHed to is physical server (Bare Metal) or a virtual machines running on top of different kind of Hypervisor node OpenXen / Virtualbox / Virtuosso  / VMWare etc.

Then the question comes how to determine whether A remote Installed Linux is Physical or Virtual ?

1. Using the dmesg kernel log utility

The good old dmesg that is used to examine and control the kernel ring buffer detects plenty of useful information which gives you the info whether a server is Virtual or Bare Metal. It is present and accessible on every Linux server out there, thus using it is the best and simplest way to determine the OS system node type.

To grep whether a machine is Virtual and the Hypervisor type use:


nginx:~# dmesg | grep "Hypervisor detected"
[0.000000] Hypervisor detected: KVM

As you see above OS installed is using the KVM Virtualization technology.

An empty output of this command means the Remote OS is installed on a physical computer.


2. Detecting the OS platform the systemd way

Systemd along with the multiple over-complication of things that nearly all sysadmins (including me hate) so much introduced something useful in the fact of hostnamectl command
that could give you the info about the OS chassis platform.


root@pcfreak:~# hostnamectl status
 Static hostname: pcfreak
         Icon name: computer-desktop
           Chassis: desktop
        Machine ID: 02425d67037b8e67cd98bd2800002671
           Boot ID: 34a83b9a79c346168082f7605c2f557c
  Operating System: Debian GNU/Linux 10 (buster)
            Kernel: Linux 4.19.0-5-amd64
      Architecture: x86-64



Below is output of a VM running on a Oracle Virtualbox HV.


linux:~# hostnamectl status
Static hostname: ubuntuserver
 Icon name: computer-vm
 Chassis: vm
 Machine ID: 2befe86cf8887ca098f509e457554beb
 Boot ID: 8021c02d65dc46b1885afb25fddcf18c
 Virtualization: oracle
 Operating System: Ubuntu 16.04.1 LTS
 Kernel: Linux 4.4.0-78-generic
 Architecture: x86-64


3. Detect concrete container virtualization with systemd-detect-virt 

Another Bare Metal or VM identify tool that was introducted some time ago by freedesktop project is systemd-detect-virt (usually command is part of systemd package).
It is useful to detect the exact virtualization on a systemd running OS systemd-detect-virt is capable to detect many type of Virtualization type that are rare like: IBM zvm S390 Z/VM, bochs, bhyve (a FreeBSD hypervisor), Mac OS's parallels, lxc (linux containers), docker containers, podman etc.

The output from the command is either none (if no virtualization is present or the VM Hypervisor Host type):


server:~# systemd-detect-virt


quake:~# systemd-detect-virt


4. Install and use facter to report per node facts


debian:~# apt-cache show facter|grep -i desc -A2
Description-en: collect and display facts about the system
 Facter is Puppet’s cross-platform system profiling library. It discovers and
 reports per-node facts, which are collected by the Puppet agent and are made

Description-md5: 88cdf9a1db3df211de4539a0570abd0a
Tag: devel::lang:ruby, devel::library, implemented-in::ruby,
root@jeremiah:/home/hipo# apt-cache show facter|grep -i desc -A1
Description-en: collect and display facts about the system
 Facter is Puppet’s cross-platform system profiling library. It discovers and

Description-md5: 88cdf9a1db3df211de4539a0570abd0a


– Install facter on Debian / Ubuntu / deb based Linux


# apt install facter –yes

– Install facter on RedHat / CentOS RPM based distros

# yum install epel-release


# yum install facter

– Install facter on OpenSuSE / SLES

# zypper install facter

Once installed on the system to find out whether the remote Operating System is Virtual:

# facter 2> /dev/null | grep virtual
is_virtual => false
virtual => physical

If the machine is a virtual machine you will get some different output like:

# facter 2> /dev/null | grep virtual
is_virtual => true
virtual => kvm

If you're lazy to grep you can use it with argument.

# facter virtual


6. Use lshw and dmidecode (list hardware configuration tool)

If you don't have the permissions to install facter on the system and you can see whether lshw (list hardware command) is not already present on remote host.

# lshw -class system  
    description: Computer
    width: 64 bits
    capabilities: smbios-2.7 vsyscall32

If the system is virtual you'll get an output similar to:

# lshw -class system  
 description: Computer
 product: VirtualBox
 vendor: innotek GmbH
 version: 1.2
 serial: 0
 width: 64 bits
 capabilities: smbios-2.5 dmi-2.5 vsyscall32
 configuration: family=Virtual Machine uuid=78B58916-4074-42E2-860F-7CAF39F5E6F5

Of course as it provides a verbosity of info on Memory / CPU type / Caches / Cores / Motherboard etc. virtualization used or not can be determined also with dmidecode / hwinfo and other tools that detect the system hardware this is described thoroughfully in my  previous article Get hardware system info on Linux.

7. Detect virtualziation using virt-what or imvirt scripts

imvirt is a little script to determine several virtualization it is pretty similar to virt-what the RedHat own script for platform identification. Even though virt-what is developed for RHEL it is available on other distros, Fedoda, Debian, Ubuntu, Arch Linux (AUR) just like is imvirt.

installing both of them is with the usual apt-get / yum or on Arch Linux with yay package manager (yay -S virt-what) …

Once run the output it produces for physical Dell / HPE / Fujitsu-Siemens Bare Metal servers would be just empty string.

# virt-what

Or if the system is Virtual Machine, you'll get the type, for example KVM (Kernel-based Virtual Machine) / virtualbox / qemu etc.




It was explained how to do a simple check whether the server works on a physical hardware or on a virtual Host hypervisor. The most basic and classic way is with dmesg. If no access to dmesg is due to restrictions you can try the other methods for systemd enabled OSes with hostnamectl / systemd-detect-virt. Other means if the tools are installed or you have the permissions to install them is with facter / lshw or with virt-what / imvirt scripts.
There definitely perhaps much more other useful tools to grasp hardware and virtualization information but this basics could be useful enough for shell scripting purposes.
If you know other tools, please share.

Linux: Compress website images for better responsiveness with Trimage Graphical tool

Tuesday, March 10th, 2020


If you run a Website or a Blog with images sooner or later you will end up with in looking for better ways to optimize the SEO of the website. I had a small discussion today with a friend of mine Mitko Ivanov who is working as SEO consultant expert,  we had a small discussion on the good practice of optimizing website pictures to reduce the website opening time. Ingeral part of Website responsiveness is the time the Browser needs to fetch all the page Images. Thus if your site is with multiple images, like this blog here, picture comperssion is definitely something that could make miracles in how website visualize for end user and increase rank in Search Engines. The easiest way to compress images of an amateur website of course is to use external picture compression service such as, this requires no knowledge at any computer technology and you can do it easy, but the problem is it shares your image to the remote website used for conversion and I personally think this is not the best idea.
For WordPress website owners of course there is plenty of plugins such as eWWW Image Optimizer that does realtime reduce of size of picture by chunking out the unnecessery bits.
Alternative to especially for people who have a little bit of technical knowledge is is to use some command line tool as optipng together with some kind of shell for loopfor details see my previous article Optimize PNG images by compressing on GNU / Linux, FreeBSD server to Improve Website overall Performance.
But for Many of Webmaster site owners this solution takes too much time as well many people just don't have even basic command line knowledge / are kinda of scared from the console but need to do image compression in a simple GUI way for those the good news are there is  Graphical cross-platform tool for losslessly optimizing PNG and JPG files for web. Trimage.
To use it it even unexperienced non enthusiast could simply roll out a new Virtual Machine on top of some VM Host machine such as Virtual Box and roll out some kind of Linux distribution via a graphical installer which is mega easy well guided and takes 15-20 minutes time.

Once machine is set-up either the Graphical Distribution tool for page management or via apt you can fetch Trimage. It is now existing in most Linux distributions so, to install it on any deb based distribution Debian / Mint / Ubuntu etc. do the usual:

# apt-get install –yes trimage


Once you have it, just move the pictures you want to compress for losslessly optimizing from your website to your Computer with Linux. Trimage GUI on the background will run commands optipng, pngcrush, advpng or jpegoptim, imageoptim and depending on the filetype remove the unnecessery file data that are appended by the program with which image was produced Gimp / Photoshop / Camera software etc. All image files are losslessy compressed on the highest available compression levels, and EXIF and other metadata is removed so you just have to recopy ( upload ) the optimized images back to the website.


That's all folks Enjoy ! 🙂


IBM TSM dsmc console client use for listing configured backups, checking set scheduled backups and backup and restore operations howto

Friday, March 6th, 2020


Creating a simple home based backup solution with some shell scripting and rsync is a common use. However as a sysadmin in a middle sized or large corporations most companies use some professional backup service such as IBM Tivoli Storage Manager TSM – recently IBM changed the name of the product to IBM Spectrum.

IBM TSM  is a data protection platform that gives enterprises a single point of control and administration for backup and recovery that is used for Privare Clouds backup and other high end solutions where data criticality is top.
Usually in large companies TSM backup handling is managed by a separate team or teams as managing a large TSM infrastructure is quite a complex task, however my experience as a sysadmin show me that even if you don't have too much of indepth into tsm it is very useful to know how to manage at least basic Incremental backup operations such as view what is set to be backupped, set-up a new directory structure for backup, check the backup schedule configured, check what files are included and which excluded from the backup store etc. 

TSM has multi OS support ans you can use it on most streamline Operating systems Windows / Mac OS X and Linux in this specific article I'll be talking concretely about backing up data with tsm on Linux, tivoli can be theoretically brought up even on FreeBSD machines via the Linuxemu BSD module and the 64-Bit Tivoli Storage Manager RPMs.
Therefore in this small article I'll try to give few useful operations for the novice admin that stumbles on tsm backupped server that needs some small maintenance.

1. Starting up the dsmc command line client


Nomatter the operating system on which you run it to run the client run:

# dsmc



Note that usually dsmc should run as superuser so if you try to run it via a normal non-root user you will get an error message like:


[ user@linux ~]$ dsmc
ANS1398E Initialization functions cannot open one of the Tivoli Storage Manager logs or a related file: /var/tsm/dsmerror.log. errno = 13, Permission denied


Tivoli SM has an extensive help so to get the use basics, type help

tsm> help
1.0 New for IBM Tivoli Storage Manager Version 6.4
2.0 Using commands
  2.1 Start and end a client command session
    2.1.1 Process commands in batch mode
    2.1.2 Process commands in interactive mode
  2.2 Enter client command names, options, and parameters
    2.2.1 Command name
    2.2.2 Options
    2.2.3 Parameters
    2.2.4 File specification syntax
  2.3 Wildcard characters
  2.4 Client commands reference
  2.5 Archive
  2.6 Archive FastBack

Enter 'q' to exit help, 't' to display the table of contents,
press enter or 'd' to scroll down, 'u' to scroll up or
enter a help topic section number, message number, option name,
command name, or command and subcommand:    


2. Listing files listed for backups


A note to make here is as in most corporate products tsm supports command aliases so any command supported described in the help like query, could be
abbreviated with its first letters only, e.g. query filespace tsm cmd can be abbreviated as

tsm> q fi

Commands can be run non-interactive mode also so if you want the output of q fi you can straight use:

tsm> dsmc q fi



This shows the directories and files that are set for backup creation with Tivoli.


3. Getting included and excluded backup set files


It is useful to know what are the exact excluded files from tsm set backup this is done with query inclexcl



4. Querying for backup schedule time

Tivoli as every other backup solution is creating its set to backup files in a certain time slot periods. 
To find out what is the time slot for backup creation use;

tsm> q sched
Schedule Name: WEEKLY_ITSERV
      Description: ITSERV weekly incremental backup
   Schedule Style: Classic
           Action: Incremental
         Priority: 5
   Next Execution: 180 Hours and 35 Minutes
         Duration: 15 Minutes
           Period: 1 Week  
      Day of Week: Wednesday
     Day of Month:
    Week of Month:
           Expire: Never  




5. Check which files have been backed up

If you want to make sure backups are really created it is a good to check, which files from the selected backup files have already
a working backup copy.

This is done with query backup like so:

tsm> q ba /home/*



If you want to query all the current files and directories backed up under a directory and all its subdirectories you need to add the -subdir=yes option as below:


tsm> q ba /home/hipo/projects/* -subdir=yes
Size      Backup Date        Mgmt Class A/I File
   —-      ———–        ———- — —-
    512  12-09-2011 19:57:09    STANDARD    A  /home/hipo/projects/hfs0106
  1,024  08-12-2011 02:46:53    STANDARD    A  /home/hipo/projects/hsm41perf
    512  12-09-2011 19:57:09    STANDARD    A  /home/hipo/projects/hsm41test
    512  24-04-2012 00:22:56    STANDARD    A  /home/hipo/projects/hsm42upg
  1,024  12-09-2011 19:57:09    STANDARD    A  /home/hipo/projects/hfs0106/test
  1,024  12-09-2011 19:57:09    STANDARD    A  /home/hipo/projects/hfs0106/test/test2
 12,048  04-12-2011 02:01:29    STANDARD    A  /home/hipo/projects/hsm41perf/tables
 50,326  30-04-2012 01:35:26    STANDARD    A  /home/hipo/projects/hsm42upg/PMR70023
 50,326  27-04-2012 00:28:15    STANDARD    A  /home/hipo/projects/hsm42upg/PMR70099
 11,013  24-04-2012 00:22:56    STANDARD    A  /home/hipo/projects/hsm42upg/md5check  


  • To make tsm, backup some directories on Linux / AIX other unices:


tsm> incr /  /usr  /usr/local  /home /lib


  • For tsm to backup some standard netware drives, use:


tsm> incr NDS:  USR:  SYS:  APPS:  


  • To backup C:\ D:\ E:\ F:\ if TSM is running on Windows


tsm> incr C:  D:  E: F:  -incrbydate 


  • To back up entire disk volumes irrespective of whether files have changed since the last backup, use the selective command with a wildcard and -subdir=yes as below:


tsm> sel /*  /usr/*   /home/*  -su=yes   ** Unix/Linux


7. Backup selected files from a backup location


It is intuitive to think you can just add some wildcard characters to select what you want
to backup from a selected location but this is not so, if you try something like below
you will get an err.


tsm> incr /home/hipo/projects/*/* -su=yes      
ANS1071E Invalid domain name entered: '/home/hipo/projects/*/*'

The proper way to select a certain folder / file for backup is with:


tsm> sel /home/hipo/projects/*/* -su=yes


8. Restoring tsm data from backup


To restore the config httpd.conf to custom directory use:


tsm> rest /etc/httpd/conf/httpd.conf  /home/hipo/restore/


N!B! that in order for above to work you need to have the '/' trailing slash at the end.

If you want to restore a file under a different name:


tsm> rest /etc/ntpd.conf  /home/hipo/restore/


9. Restoring a whole backupped partition


tsm> rest /home/*  /tmp/restore/ -su=yes


This is using the Tivoli 'Restoring multiple files and directories', and the files to restore '*'
are kept till the one that was recovered (saying this in case if you accidently cancel the restore)


10. Restoring files with back date 


By default the restore function will restore the latest available backupped file, if you need
to recover a specific file, you need the '-inactive' '-pick' options.
The 'pick' interface is interactive so once listed you can select the exact file from the date
you want to restore.

General restore command syntax is:

tsm> restore [source-file] [destination-file]


tsm> rest /home/hipo/projects/*  /tmp/restore/ -su=yes  -inactive -pick

TSM Scrollable PICK Window – Restore

     #    Backup Date/Time        File Size A/I  File
   170. | 12-09-2011 19:57:09        650  B  A   /home/hipo/projects/hsm41test/inclexcl.test
   171. | 12-09-2011 19:57:09       2.74 KB  A   /home/hipo/projects/hsm41test/inittab.ORIG
   172. | 12-09-2011 19:57:09       2.74 KB  A   /home/hipo/projects/hsm41test/inittab.TEST
   173. | 12-09-2011 19:57:09       1.13 KB  A   /home/hipo/projects/hsm41test/md5.out
   174. | 30-04-2012 01:35:26        512  B  A   /home/hipo/projects/hsm42125upg/PMR70023
   175. | 26-04-2012 01:02:08        512  B  I   /home/hipo/projects/hsm42125upg/PMR70023
   176. | 27-04-2012 00:28:15        512  B  A   /home/hipo/projects/hsm42125upg/PMR70099
   177. | 24-04-2012 19:17:34        512  B  I   /home/hipo/projects/hsm42125upg/PMR70099
   178. | 24-04-2012 00:22:56       1.35 KB  A   /home/hipo/projects/hsm42125upg/dsm.opt
   179. | 24-04-2012 00:22:56       4.17 KB  A   /home/hipo/projects/hsm42125upg/dsm.sys
   180. | 24-04-2012 00:22:56       1.13 KB  A   /home/hipo/projects/hsm42125upg/dsmmigfstab
   181. | 24-04-2012 00:22:56       7.30 KB  A   /home/hipo/projects/hsm42125upg/filesystems
   182. | 24-04-2012 00:22:56       1.25 KB  A   /home/hipo/projects/hsm42125upg/inclexcl
   183. | 24-04-2012 00:22:56        198  B  A   /home/hipo/projects/hsm42125upg/inclexcl.dce
   184. | 24-04-2012 00:22:56        291  B  A   /home/hipo/projects/hsm42125upg/inclexcl.ox_sys
   185. | 24-04-2012 00:22:56        650  B  A   /home/hipo/projects/hsm42125upg/inclexcl.test
   186. | 24-04-2012 00:22:56        670  B  A   /home/hipo/projects/hsm42125upg/inetd.conf
   187. | 24-04-2012 00:22:56       2.71 KB  A   /home/hipo/projects/hsm42125upg/inittab
   188. | 24-04-2012 00:22:56       1.00 KB  A   /home/hipo/projects/hsm42125upg/md5check
   189. | 24-04-2012 00:22:56      79.23 KB  A   /home/hipo/projects/hsm42125upg/mkreport.020423.out
   190. | 24-04-2012 00:22:56       4.27 KB  A   /home/hipo/projects/hsm42125upg/ssamap.020423.out
   191. | 26-04-2012 01:02:08      12.78 MB  A   /home/hipo/projects/hsm42125upg/PMR70023/70023.tar
   192. | 25-04-2012 16:33:36      12.78 MB  I   /home/hipo/projects/hsm42125upg/PMR70023/70023.tar
<U>=Up  <D>=Down  <T>=Top  <B>=Bottom  <R#>=Right  <L#>=Left
<G#>=Goto Line #  <#>=Toggle Entry  <+>=Select All  <->=Deselect All
<#:#+>=Select A Range <#:#->=Deselect A Range  <O>=Ok  <C>=Cancel

To navigate in pick interface you can select individual files to restore via the number seen leftside.
To scroll up / down use 'U' and 'D' as described in the legenda.


11. Restoring your data to another machine


In certain circumstances, it may be necessary to restore some, or all, of your data onto a machine other than the original from which it was backed up.

In ideal case the machine platform should be identical to that of the original machine. Where this is not possible or practical please note that restores are only possible for partition types that the operating system supports. Thus a restore of an NTFS partition to a Windows 9x machine with just FAT support may succeed but the file permissions will be lost.
TSM does not work fine with cross-platform backup / restore, so better do not try cross-platform restores.
 Trying to restore files onto a Windows machine that have previously been backed up with a non-Windows one. TSM created backups on Windows sent by other OS platforms can cause  backups to become inaccessible from the host system.

To restore your data to another machine you will need the TSM software installed on the target machine. Entries in Tivoli configuration files dsm.sys and/or dsm.opt need to be edited if the node that you are restoring from does not reside on the same server. Please see our help page section on TSM configuration files for their locations for your operating system. 

To access files from another machine you should then start the TSM client as below:


# dsmc -virtualnodename=RESTORE.MACHINE      

You will then be prompted for the TSM password for this machine.


You will probably want to restore to a different destination to the original files to prevent overwriting files on the local machine, as below:


  • Restore of D:\ Drive to D:\Restore ** Windows 


tsm> rest D:\*   D:\RESTORE\    -su=yes 


  • Restore user /home/* to /scratch on ** Mac, Unix/Linux


tsm> rest /home/* /scratch/     -su=yes  


  • Restoring Tivoli data on old netware


tsm> rest SOURCE-SERVER\USR:*  USR:restore/   -su=yes  ** Netware


12. Adding more directories for incremental backup / Check whether TSM backup was done correctly?

The easiest way is to check the produced dschmed.log if everything is okay there should be records in the log that Tivoli backup was scheduled in a some hours time
A normally produced backup scheduled in log should look something like:


14-03-2020 23:03:04 — SCHEDULEREC STATUS BEGIN
14-03-2020 23:03:04 Total number of objects inspected:   91,497
14-03-2020 23:03:04 Total number of objects backed up:      113
14-03-2020 23:03:04 Total number of objects updated:          0
14-03-2020 23:03:04 Total number of objects rebound:          0
14-03-2020 23:03:04 Total number of objects deleted:          0
14-03-2020 23:03:04 Total number of objects expired:         53
14-03-2020 23:03:04 Total number of objects failed:           6
14-03-2020 23:03:04 Total number of bytes transferred:    19.38 MB
14-03-2020 23:03:04 Data transfer time:                    1.54 sec
14-03-2020 23:03:04 Network data transfer rate:        12,821.52 KB/sec
14-03-2020 23:03:04 Aggregate data transfer rate:        114.39 KB/sec
14-03-2020 23:03:04 Objects compressed by:                    0%
14-03-2020 23:03:04 Elapsed processing time:           00:02:53
14-03-2020 23:03:04 — SCHEDULEREC STATUS END
14-03-2020 23:03:04 — SCHEDULEREC OBJECT END WEEKLY_23_00 14-12-2010 23:00:00
14-03-2020 23:03:04 Scheduled event 'WEEKLY_23_00' completed successfully.
14-03-2020 23:03:04 Sending results for scheduled event 'WEEKLY_23_00'.
14-03-2020 23:03:04 Results sent to server for scheduled event 'WEEKLY_23_00'.


in case of errors you should check dsmerror.log


In this article I've briefly evaluated some basics of IBM Commercial Tivoli Storage Manager (TSM) to be able to  list backups, check backup schedules and how to the files set to be
excluded from a backup location and most importantly how to check that data backed up data is in a good shape and accessible.
It was explained how backups can be restored on a local and remote machine as well as how to  append new files to be set for backup on next incremental scheduled backup.
It was shown how the pick interactive cli interface could be used to restore files at a certain data back in time as well as how full partitions can be restored and how some
certain file could be retrieved from the TSM data copy.

Improve DNS lookup domain resolve speed on Linux / UNIX servers through /etc/resolv.conf timeout, attempts, rorate options

Thursday, February 27th, 2020

If you're an performance optimization freak and you want to optimize your Linux servers to perform better in terms of DNS resolve slowness because of failing DNS resolve queries due to Domain Name Server request overload or due to Denial of Service attack towards it. It might be interesting to mention about some little known functionalities of /etc/resolv.conf described in the manual page.

The defined nameservers under /etc/resolv.conf are queried one by one waiting for responce of the sent DNS resolve request if it is not replied from the first one for some time, the 2nd one is queried until a responce is received by any of the defined nameserver IPs

A default /etc/resolv.conf on a new Linux server install looks something like this:


However one thing is that defined if NS1 dies out due to anything, it takes timeout time until the second or 3rd working one takes over to resolve the query.
This is controlled by the timeout value.

Below is description from man page

sets the amount of time the resolver will wait for a
response from a remote name server before retrying the
query via a different name server.  Measured in
seconds, the default is RES_TIMEOUT (currently 5, see
<resolv.h>).  The value for this option is silently
capped to 30.


  • In other words Timeout value is time to resolving IP address from hostname through DNS server,timeout option is to reduce hostname lookup time

As you see from manual default is 5 seconds which is quite high, thus reducing the value to 3 secs or even 1 seconds is a good sysadmin practice IMHO.

Another value that could be tuned in /etc/resolv.conf is attempts value below is what the manual says about it: 

                     Sets the number of times the resolver will send a query to its name servers before giving up and returning an error to the calling application.  The default is RES_DFLRETRY (cur‐
                     rently 2, see <resolv.h>).  The value for this option is silently capped to 5.



  • This means default behaviour on a failing DNS query resolve is to try to resend the DNS resolve request to the failing nameserver 5 more times, that is quite high thus it is a good practice from my experience to reduce it to something as 2 or 1

Another very useful resolv.conf value is rotate
The default behavior of how DNS outgoing Domain requests are handled is to use only the primary defined DNS, instead if you need to do a load balancing in a round-robin manner add to conf rotate option.

The final /etc/resolv.conf optimized would look like so:


linux# cat /etc/resolv.conf

options ndots:1
options timeout:1
options attempts:1
options rotate

The search opt. placement is also important to be placed in the right location in the file. The correct placement is after the nameservers defined, I have to say in older Linux distributions the correct placement of search option was to be on top of resolv.conf.

Note that this configuration is good and fits not only Linux but also is a good DNS lookup optimization speed on other UNIX derivatives such as FreeBSD / NetBSD as well as other Proprietary OS UNIX machines running IBM AIX etc.

On Linux it is also possible to place the options given in one single line like so, below is the config I have on my running Lenovo server:


options timeout:2 attempts:1 rotate


When is /etc/hosts record venerated and when is /etc/resolv.conf DNS defined queried for a defined DNS host?


One important thing to know when dealing with /etc/resolv.conf  is what happens if a Name domain is defined in both /etc/hosts and /etc/resolv.conf.
For example you have a domain record in /etc/hosts to a certain domain
but the DNS nameserver in Google has a record to an IP that is the real IP


# dig @

; <<>> DiG 9.11.5-P4-5.1-Debian <<>> @
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 54656
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

; EDNS: version: 0, flags:; udp: 512
;                  IN      A

;; ANSWER SECTION:           3599    IN      A

;; Query time: 40 msec
;; WHEN: чт фев 27 18:04:23 EET 2020
;; MSG SIZE  rcvd: 57


  • Which of the 2 different IPs will the applications installed on the server such as Apache / Squid / MySQL / tinyproxy for their DNS resolve operations?


Now it is time to say few words about /etc/nsswitch.conf (The Nameserver switching configuration file). This file defines the DNS resolve file used order in which the Operationg System does IP to domain translation and backwards.

# grep -i hosts: /etc/nsswitch.conf

hosts:          files dns myhostname

As you can see first the local defined in files like /etc/hosts record is venerated when resolving, then it is the externally configured DNS resolver IPs from /etc/resolv.conf.

nsswitch.conf  is used also for defining where the OS will look up for user / passwd (e.g. login credentials) on login, on systems which are having an LDAP authentication via the sssd (system security services daemon) via definitions like:


passwd:     files sss
shadow:     files sss
group:      files sss

E.g. the user login will be first try to read from local /etc/passwd , /etc/shadow , /etc/groups and if no matched record is found then the LDAP service the sssd is queried.

Procedure Instructions to safe upgrade CentOS / RHEL Linux 7 Core to latest release

Thursday, February 13th, 2020


Generally upgrading both RHEL and CentOS can be done straight with yum tool just we're pretty aware and mostly anyone could do the update, but it is good idea to do some
steps in advance to make backup of any old basic files that might help us to debug what is wrong in case if the Operating System fails to boot after the routine Machine OS restart
after the upgrade that is usually a good idea to make sure that machine is still bootable after the upgrade.

This procedure can be shortened or maybe extended depending on the needs of the custom case but the general framework should be useful anyways to someone that's why
I decided to post this.

Before you go lets prepare a small status script which we'll use to report status of  sysctl installed and enabled services as well as the netstat connections state and
configured IP addresses and routing on the system.

The script to be used during our different upgrade stages:

# script status ###
echo "STARTED: $(date '+%Y-%m-%d_%H-%M-%S'):" | tee /root/logs/yumcheckupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
systemctl list-unit-files –type=service | grep enabled
systemctl | grep ".service" | grep "running"
netstat -tulpn
netstat -r
ip a s
/sbin/route -n
echo "ENDED $(date '+%Y-%m-%d_%H-%M-%S'):" | tee /root/logs/yumcheckupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out


– Save the script in any file like /root/

– Make the /root/logs directoriy.

[root@redhat: ~ ]# mkdir /root/logs
[root@redhat: ~ ]# vim /root/
[root@redhat: ~ ]# chmod +x /root/


1. Get a dump of CentOS installed version release and grub-mkconfig generated os_probe


[root@redhat: ~ ]# cat /etc/redhat-release  > /root/logs/redhat-release-vorher-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
[root@redhat: ~ ]# cat /etc/grub.d/30_os-prober > /root/logs/grub2-efi-vorher-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out


2. Clear old versionlock marked RPM packages (if there are such)


On servers maintained by multitude of system administrators just like the case is inside a Global Corporations and generally in the corporate world , where people do access the systems via LDAP and more than a single person
has superuser privileges. It is a good prevention measure to use yum package management  functionality to RPM based Linux distributions called  versionlock.
versionlock for those who hear it for a first time is locking the versions of the installed RPM packages so if someone by mistake or on purpose decides to do something like :

[root@redhat: ~ ]# yum install packageversion

Having the versionlock set will prevent the updated package to be installed with a different branch package version.

Also it will prevent a playful unknowing person who just wants to upgrade the system without any deep knowledge to be able to

[root@redhat: ~ ]# yum upgrade

update and leave the system in unbootable state, that will be only revealed during the next system reboot.

If you haven't used versionlock before and you want to use it you can do it with:

[root@redhat: ~ ]# yum install yum-plugin-versionlock

To add all the packages for compiling C code and all the interdependend packages, you can do something like:


[root@redhat: ~ ]# yum versionlock gcc-*

If you want to clear up the versionlock, once it is in use run:

[root@redhat: ~ ]#  yum versionlock clear
[root@redhat: ~ ]#  yum versionlock list


3.  Check RPC enabled / disabled


This step is not necessery but it is a good idea to check whether it running on the system, because sometimes after upgrade rpcbind gets automatically started after package upgrade and reboot. 
If we find it running we'll need to stop and mask the service.


# check if rpc enabled
[root@redhat: ~ ]# systemctl list-unit-files|grep -i rpc
var-lib-nfs-rpc_pipefs.mount                                      static
auth-rpcgss-module.service                                        static
rpc-gssd.service                                                  static
rpc-rquotad.service                                               disabled
rpc-statd-notify.service                                          static
rpc-statd.service                                                 static
rpcbind.service                                                   disabled
rpcgssd.service                                                   static
rpcidmapd.service                                                 static
rpcbind.socket                                                    disabled                                                 static                                                    static

[root@redhat: ~ ]# systemctl status rpcbind.service
● rpcbind.service – RPC bind service
   Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; disabled; vendor preset: enabled)
   Active: inactive (dead)


[root@redhat: ~ ]# systemctl status rpcbind.socket
● rpcbind.socket – RPCbind Server Activation Socket
   Loaded: loaded (/usr/lib/systemd/system/rpcbind.socket; disabled; vendor preset: enabled)
   Active: inactive (dead)
   Listen: /var/run/rpcbind.sock (Stream)
           [::]:111 (Stream)
           [::]:111 (Datagram)


4. Check any previously existing downloaded / installed RPMs (check yum cache)


yum install package-name / yum upgrade keeps downloaded packages via its operations inside its cache directory structures in /var/cache/yum/*.
Hence it is good idea to check what were the previously installed packages and their count.


[root@redhat: ~ ]# cd /var/cache/yum/x86_64/;
[root@redhat: ~ ]# find . -iname '*.rpm'|wc -l


5. List RPM repositories set on the server


 [root@redhat: ~ ]# yum repolist
Loaded plugins: fastestmirror, versionlock
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
Determining fastest mirrors
repo id                                                                                 repo name                                                                                                            status
!atos-ac/7/x86_64                                                                       Atos Repository                                                                                                       3,128
!base/7/x86_64                                                                          CentOS-7 – Base                                                                                                      10,019
!cr/7/x86_64                                                                            CentOS-7 – CR                                                                                                         2,686
!epel/x86_64                                                                            Extra Packages for Enterprise Linux 7 – x86_64                                                                          165
!extras/7/x86_64                                                                        CentOS-7 – Extras                                                                                                       435
!updates/7/x86_64                                                                       CentOS-7 – Updates                                                                                                    2,500


This step is mandatory to make sure you're upgrading to latest packages from the right repositories for more concretics check what is inside in confs /etc/yum.repos.d/ ,  /etc/yum.conf 

6. Clean up any old rpm yum cache packages


This step is again mandatory but a good to follow just to have some more clearness on what packages is our upgrade downloading (not to mix up the old upgrades / installs with our newest one).
For documentation purposes all deleted packages list if such is to be kept under /root/logs/yumclean-install*.out file

[root@redhat: ~ ]# yum clean all |tee /root/logs/yumcleanall-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out


7. List the upgradeable packages's latest repository provided versions


[root@redhat: ~ ]# yum check-update |tee /root/logs/yumcheckupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out


Then to be aware how many packages we'll be updating:


[root@redhat: ~ ]#  yum check-update | wc -l


8. Apply the actual uplisted RPM packages to be upgraded


[root@redhat: ~ ]# yum update |tee /root/logs/yumupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out


Again output is logged to /root/logs/yumcheckupate-*.out 


9. Monitor downloaded packages count real time


To make sure yum upgrade is not in some hanging state and just get some general idea in which state of the upgrade is it e.g. Download / Pre-Update / Install  / Upgrade/ Post-Update etc.
in mean time when yum upgrade is running to monitor,  how many packages has the yum upgrade downloaded from remote RPM set repositories:


[root@redhat: ~ ]#  watch "ls -al /var/cache/yum/x86_64/7Server/…OS-repository…/packages/|wc -l"


10. Run status script to get the status again


[root@redhat: ~ ]# sh /root/ |tee /root/logs/status-before-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out


11. Add back versionlock for all RPM packs


Set all RPM packages installed on the RHEL / CentOS versionlock for all packages.


#==if needed
# yum versionlock \*



12. Get whether old software configuration is not messed up during the Package upgrade (Lookup the logs for .rpmsave and .rpmnew)


During the upgrade old RPM configuration is probably changed and yum did automatically save .rpmsave / .rpmnew saves of it thus it is a good idea to grep the prepared logs for any matches of this 2 strings :

[root@redhat: ~ ]#   grep -i ".rpm" /root/logs/yumupdate-server-host-2020-01-20_14-30-41.out
[root@redhat: ~ ]#  grep -i ".rpmsave" /root/logs/yumupdate-server-host-2020-01-20_14-30-41.out
[root@redhat: ~ ]#  grep -i ".rpmnew" /root/logs/yumupdate-server-host-2020-01-20_14-30-41.out

If above commands returns output usually it is fine if there is is .rpmnew output but, if you get grep output of .rpmsave it is a good idea to review the files compare with the original files that were .rpmsaved with the 
substituted config file and atune the differences with the changes manually made for some program functionality.

What are the .rpmsave / .rpmnew files ?
This files are coded files that got triggered by the RPM install / upgrade due to prewritten procedures on time of RPM build.


If a file was installed as part of a rpm, it is a config file (i.e. marked with the %config tag), you've edited the file afterwards and you now update the rpm then the new config file (from the newer rpm) will replace your old config file (i.e. become the active file).
The latter will be renamed with the .rpmsave suffix.

If a file was installed as part of a rpm, it is a noreplace-config file (i.e. marked with the %config(noreplace) tag), you've edited the file afterwards and you now update the rpm then your old config file will stay in place (i.e. stay active) and the new config file (from the newer rpm) will be copied to disk with the .rpmnew suffix.
See e.g. this table for all the details. 

In both cases you or some program has edited the config file(s) and that's why you see the .rpmsave / .rpmnew files after the upgrade because rpm will upgrade config files silently and without backup files if the local file is untouched.

After a system upgrade it is a good idea to scan your filesystem for these files and make sure that correct config files are active and maybe merge the new contents from the .rpmnew files into the production files. You can remove the .rpmsave and .rpmnew files when you're done.

If you need to get a list of all .rpmnew .rpmsave files on the server do:

[root@redhat: ~ ]#  find / -print | egrep "rpmnew$|rpmsave$


13. Reboot the system 

To check whether on next hang up or power outage the system will boot normally after the upgrade, reboot to test it.


you can :


[root@redhat: ~ ]#  reboot



[root@redhat: ~ ]#  shutdown -r now

or if on newer Linux with systemd in ues below systemctl

[root@redhat: ~ ]#  systemctl start


14. Get again the system status with our status script after reboot

[root@redhat: ~ ]#  sh /root/ |tee /root/logs/status-after-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out


15. Clean up any versionlocks if earlier set


[root@redhat: ~ ]# yum versionlock clear
[root@redhat: ~ ]# yum versionlock list


16. Check services and logs for problems


After the reboot Check closely all running services on system make sure every process / listening ports and services on the system are running fine, just like before the upgrade.
If the sytem had firewall,  check whether firewall rules are not broken, e.g. some NAT is not missing or anything earlier configured to automatically start via /etc/rc.local or some other
custom scripts were run and have done what was expected. 
Go through all the logs in /var/log that are most essential /var/log/boot.log , /var/log/messages … yum.log etc. that could reveal any issues after the boot. In case if running some application server or mail server check /var/log/mail.log or whenever it is configured to log.
If the system runs apache closely check the logs /var/log/httpd/error.log or php_errors.log for any strange errors that occured due to some issues caused by the newer installed packages.
Usually most of the cases all this should be flawless but a multiple check over your work is a stake for good results.

Fix eth changing network interface names from new Linux naming scheme ens, eno, em1 to legacy eth0, eth1, eth2 on CentOS Linux

Thursday, January 16th, 2020


On CentOS / RHEL 7 / Fedora 19+ and other Linux distributions, the default network eth0, eth1 .. interface naming scheme has been changed and in newer Linux kernels OS-es to names such as – ens3 , eno1, enp5s2, em1 etc.,  well known old scheme for eth* is now considered a legacy.
This new Network card naming in Linux OS is due to changes made in Kernel / modules and udev  rules which resembles how Ethernet ifaces are named on other UNIX like systems.
The weird name is taken depending on the Hardware Network card vendor name and is a standard for years in FreeBSD and Mac OSX, however this was not so over the years,
so for old school sysadmins that's pretty annoying as, we're much used to the eth0 / eth1 / eth2 / eth3 naming standard which brought some clearness on the network card naming.

Also for systems which are upgraded from old Linux OS distro releases to a newer ones, that includes this great new "cool" feature, that fits so well the New age-of computing Cloud craziness.
That behaviour could create a number of problems, especially if the already Production working servers due to failure to bring up some of the network devices after the upgrade or, even if you fix that by editting the /etc/network* / etc/sysconfig/networking/* by hand still there is even more stuff that won't work properly, such as any custom made iptables / ipset firewalls rules, or any kind of custom used third party Shell / Perl scripts that depend on the old-school conventional and (convenient easy to remember!!!) eth0, eth2 etc. naming

For sysadmins who are using some kind of Application Clustering with something like corosync / pacemaker this new fuzzy improvement makes things even worse as having a changed interface name of the card will break the cluster …


1. Get list of the LAN Card Server hardware


To get a better view on the server installed and recognized LAN Cards use lspci / dmidecode commands:

 lspci |grep -i Ether -A1 -B1
01:00.4 USB controller: Hewlett-Packard Company Integrated Lights-Out Standard Virtual USB Controller (r                                                                                                           ev 03)
02:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev                                                                                                            01)
02:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev                                                                                                            01)
02:00.2 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev                                                                                                            01)
02:00.3 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev                                                                                                            01)
03:00.0 RAID bus controller: Hewlett-Packard Company Smart Array Gen9 Controllers (rev 01)
05:00.0 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev                                                                                                            01)
05:00.1 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev                                                                                                            01)
05:00.2 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev                                                                                                            01)
05:00.3 Ethernet controller: Broadcom Inc. and subsidiaries NetXtreme BCM5719 Gigabit Ethernet PCIe (rev                                                                                                            01)
7f:08.0 System peripheral: Intel Corporation Xeon E7 v3/Xeon E5 v3/Core i7 QPI Link 0 (rev 02)


lspci reports all attached LAN Cards to server which are plugged in on the Motherbord, since that specific server has a Motherboard integrated LAN Adapters too, we can see this one
via dmidecode.

# dmidecode |grep -i Ether -A 5 -B 5

Handle 0x00C5, DMI type 41, 11 bytes
Onboard Device
        Reference Designation: Embedded LOM 1 Port 3
        Type: Ethernet
        Status: Enabled
        Type Instance: 3
        Bus Address: 0000:XX:00.X

Handle 0x00C6, DMI type 41, 11 bytes
Onboard Device
        Reference Designation: Embedded LOM 1 Port 4
        Type: Ethernet
        Status: Enabled
        Type Instance: 4
        Bus Address: 0000:0X:00.X

Handle 0x00C7, DMI type 41, 11 bytes

                HP Ethernet 1Gb 4-port 331T Adapter – NIC
                Slot 2

Handle 0x00E3, DMI type 203, 34 bytes
OEM-specific Type
        Header and Data:


The illustrate the eth0 changing name issue, here is example taken from server on how eth1 interface is named on a new CentOS install:

# ip addr show

eno1: [BROADCAST,MULTICAST,UP,LOWER_UP] mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether 6c:0b:84:6c:48:1c brd ff:ff:ff:ff:ff:ff
inet brd scope global eno1
inet6 2606:b400:c00:48:6e0b:84ff:fe6c:481c/128 scope global dynamic
valid_lft 2326384sec preferred_lft 339184sec
inet6 fe80::6e0b:84ff:fe6c:481c/64 scope link
valid_lft forever preferred_lft forever



2. Disable Network Manager on the server

To prevent potential problems for future with randomly changing Network card names order on reboots and other mess,
it is generally a good idea to disable Network Manager.


# systemctl disable NetworkManager
rm '/etc/systemd/system/'
rm '/etc/systemd/system/dbus-org.freedesktop.NetworkManager.service'
rm '/etc/systemd/system/dbus-org.freedesktop.nm-dispatcher.service'


3. Check and correct network configuration if necessery in  /etc/sysconfig/network-scripts/ifcfg-*

Either fix the naming across all files ifcfg-* to match eth0 / eth1 / ethXX or even better both change the NAME and DEVICE in files and completely rename the files ifcfg-eno1 to ifcfg-eth1 ..
ifcfg-enoXX to ifcfg-ethXX

server:~# cat /etc/sysconfig/network-scripts/ifcfg-eno1


4. Fix the interface scheme naming through passing a GRUB boot parameter to Kernel


a. Create backup of /etc/default/grub

cp -rpf /etc/default/grub /etc/default/grub_bak_date +"%Y_%m_%Y"

b. Edit /etc/default/grub

c. Find config parameter GRUB_CMDLINE_LINUX

d. Add net.ifnames=0 biosdevname=0 to the line


net.ifnames=0 biosdevname=0

After the change the line should look like

GRUB_CMDLINE_LINUX=" crashkernel=auto net.ifnames=0 biosdevname=0 rhgb quiet"


e. Regenerate GRUB loader to have included the new config

server:~# grub2-mkconfig -o /boot/grub2/grub.cfg

f. Reboot the sytem

server:~# shutdown -r now


5. Fix auto-generated inconvenient naming by modifying udev rules

The Mellanox Ehternet server card vendor's workaround to the ever changing eth names is modify udev rules to be able to have the ordinary eth0 / eth1 / eth2 … Lan card name scheme.
In short this is recommended for Mellanox but should work on any other Lan card device attached on a Linux powered server.

# cat /etc/sysconfig/network-scripts/ifcfg-eth1


# cat /etc/sysconfig/network-scripts/ifcfg-eth2


# vi /etc/udev/rules.d/70-persistent-net.rules

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="7c:fe:90:cb:76:02", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth1"

SUBSYSTEM=="net", ACTION=="add", DRIVERS=="?*", ATTR{address}=="7c:fe:90:cb:76:03", ATTR{dev_id}=="0x0", ATTR{type}=="1", KERNEL=="eth*", NAME="eth2"


Next step is to reboot.

# /sbin/reboot

After a while when the server boots check with ip or ifconfig the configuration to make sure the ethXX ordering is proper again.


# /sbin/ifconfig eth1
eth1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST> mtu 1500
inet netmask broadcast
inet6 fe80::7efe:90ff:fecb:7602 prefixlen 64 scopeid 0x20<link>
ether 7c:fe:90:cb:76:02 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 23 bytes 3208 (3.1 KiB)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0
eth2: flags=4099<UP,BROADCAST,MULTICAST> mtu 1500
ether 7c:fe:90:cb:76:03 txqueuelen 1000 (Ethernet)
RX packets 0 bytes 0 (0.0 B)
RX errors 0 dropped 0 overruns 0 frame 0
TX packets 0 bytes 0 (0.0 B)
TX errors 0 dropped 0 overruns 0 carrier 0 collisions 0

On some Linux distributions, if it happens this udev extra configuration is not venerated, use net.ifnames=0 biosdevname=0 grub configuration.

6. Verify eth interfaces are present    

# ip addr show


eth0: [BROADCAST,MULTICAST,UP,LOWER_UP] mtu 1500 qdisc pfifo_fast state UP qlen 1000

link/ether 6c:0b:84:6c:48:1c brd ff:ff:ff:ff:ff:ff

inet brd scope global eno1

inet6 2606:b400:c00:48:6e0b:84ff:fe6c:481c/128 scope global dynamic

valid_lft 2326384sec preferred_lft 339184sec

inet6 fe80::6e0b:84ff:fe6c:481c/64 scope link

valid_lft forever preferred_lft forever

That's all this should put an end to the annoying auto generated naming lan device naming.




So what was explained up was how to resolve problems caused by autogenerated ethernet interface cards by a new functionality in the Linux kernel, so Network cards are again visible via ip address show / ifconfig again in a proper order eth0 / eth1 / eth2 / eth3 etc. instead of a vendor generated cryptic names as ens / eno / em etc. This is possible via either by editing udev rules or grub configuration. Doing so saves nerves and makes sysadmin life better, at least it did mine.
That's all this should put an end to the annoying auto generated naming.

How to debug failing service in systemctl and add a new IP network alias in CentOS Linux

Wednesday, January 15th, 2020


If you get some error with some service that is start / stopped via systemctl you might be pondering how to debug further why the service is not up then then you'll be in the situation I was today.
While on one configured server with 8 eth0 configured ethernet network interfaces the network service was reporting errors, when atempted to restart the RedHat way via:

service network restart

to further debug what the issue was as it was necessery I had to find a way how to debug systemctl so here is how:


How to do a verbose messages status for sysctlct?


linux:~# systemctl status network

linux:~# systemctl status network


Another useful hint is to print out only log messages for the current boot, you can that with:

# journalctl -u service-name.service -b


if you don't want to have the less command like page separation ( paging ) use the –no-pager argument.


# journalctl -u network –no-pager

Jan 08 17:09:14 lppsq002a network[8515]: Bringing up interface eth5:  [  OK  ]

    Jan 08 17:09:15 lppsq002a network[8515]: Bringing up interface eth6:  [  OK  ]
    Jan 08 17:09:15 lppsq002a network[8515]: Bringing up interface eth7:  [  OK  ]
    Jan 08 17:09:15 lppsq002a systemd[1]: network.service: control process exited, code=exited status=1
    Jan 08 17:09:15 lppsq002a systemd[1]: Failed to start LSB: Bring up/down networking.
    Jan 08 17:09:15 lppsq002a systemd[1]: Unit network.service entered failed state.
    Jan 08 17:09:15 lppsq002a systemd[1]: network.service failed.
    Jan 15 11:04:45 lppsq002a systemd[1]: Starting LSB: Bring up/down networking…
    Jan 15 11:04:45 lppsq002a network[55905]: Bringing up loopback interface:  [  OK  ]
    Jan 15 11:04:45 lppsq002a network[55905]: Bringing up interface eth0:  RTNETLINK answers: File exists
    Jan 15 11:04:45 lppsq002a network[55905]: [  OK  ]
    Jan 15 11:04:45 lppsq002a network[55905]: Bringing up interface eth1:  RTNETLINK answers: File exists
    Jan 15 11:04:45 lppsq002a network[55905]: [  OK  ]
    Jan 15 11:04:46 lppsq002a network[55905]: Bringing up interface eth2:  ERROR     : [/etc/sysconfig/network-scripts/ifup-eth] Device eth2 has different MAC address than expected, ignoring.
    Jan 15 11:04:46 lppsq002a network[55905]: [FAILED]
    Jan 15 11:04:46 lppsq002a network[55905]: Bringing up interface eth3:  RTNETLINK answers: File exists
    Jan 15 11:04:46 lppsq002a network[55905]: [  OK  ]
    Jan 15 11:04:46 lppsq002a network[55905]: Bringing up interface eth4:  ERROR     : [/etc/sysconfig/network-scripts/ifup-eth] Device eth4 does not seem to be present, delaying initialization.
    Jan 15 11:04:46 lppsq002a network[55905]: [FAILED]
    Jan 15 11:04:46 lppsq002a network[55905]: Bringing up interface eth5:  RTNETLINK answers: File exists
    Jan 15 11:04:46 lppsq002a network[55905]: [  OK  ]
    Jan 15 11:04:46 lppsq002a network[55905]: Bringing up interface eth6:  RTNETLINK answers: File exists
    Jan 15 11:04:47 lppsq002a network[55905]: [  OK  ]
    Jan 15 11:04:47 lppsq002a network[55905]: Bringing up interface eth7:  RTNETLINK answers: File exists
    Jan 15 11:04:47 lppsq002a network[55905]: [  OK  ]
    Jan 15 11:04:47 lppsq002a network[55905]: RTNETLINK answers: File exists
    Jan 15 11:04:47 lppsq002a network[55905]: RTNETLINK answers: File exists
    Jan 15 11:04:47 lppsq002a network[55905]: RTNETLINK answers: File exists
    Jan 15 11:04:47 lppsq002a network[55905]: RTNETLINK answers: File exists
    Jan 15 11:04:47 lppsq002a network[55905]: RTNETLINK answers: File exists
    Jan 15 11:04:47 lppsq002a network[55905]: RTNETLINK answers: File exists
    Jan 15 11:04:47 lppsq002a network[55905]: RTNETLINK answers: File exists
    Jan 15 11:04:47 lppsq002a network[55905]: RTNETLINK answers: File exists
    Jan 15 11:04:47 lppsq002a network[55905]: RTNETLINK answers: File exists
    Jan 15 11:04:47 lppsq002a systemd[1]: network.service: control process exited, code=exited status=1
    Jan 15 11:04:47 lppsq002a systemd[1]: Failed to start LSB: Bring up/down networking.
    Jan 15 11:04:47 lppsq002a systemd[1]: Unit network.service entered failed state.
    Jan 15 11:04:47 lppsq002a systemd[1]: network.service failed.
    Jan 15 11:08:22 lppsq002a systemd[1]: Starting LSB: Bring up/down networking…
    Jan 15 11:08:22 lppsq002a network[56841]: Bringing up loopback interface:  [  OK  ]
    Jan 15 11:08:22 lppsq002a network[56841]: Bringing up interface eth0:  RTNETLINK answers: File exists
    Jan 15 11:08:22 lppsq002a network[56841]: [  OK  ]
    Jan 15 11:08:26 lppsq002a network[56841]: Bringing up interface eth1:  RTNETLINK answers: File exists
    Jan 15 11:08:26 lppsq002a network[56841]: [  OK  ]
    Jan 15 11:08:26 lppsq002a network[56841]: Bringing up interface eth2:  ERROR     : [/etc/sysconfig/network-scripts/ifup-eth] Device eth2 has different MAC address than expected, ignoring.
    Jan 15 11:08:26 lppsq002a network[56841]: [FAILED]
    Jan 15 11:08:26 lppsq002a network[56841]: Bringing up interface eth3:  RTNETLINK answers: File exists
    Jan 15 11:08:27 lppsq002a network[56841]: [  OK  ]



Another useful thing debug arguments is the -xe to do:

# journalctl -xe –no-pager


  • -x (– catalog)
    Augment log lines with explanation texts from the message catalog.
    This will add explanatory help texts to log messages in the output
    where this is available.
  •  -e ( –pager-end )  Immediately jump to the end of the journal inside the implied pager


Finally after fixing the /etc/sysconfig/networking-scripts/* IP configuration issues I had all the 8 Ethernet interfaces to work as expected

# systemctl status network




2. Adding a new IP alias to eth0 interface

Further on I had  to add an IP Alias on the CenOS via its networking configuration, this is done by editing /etc/sysconfig/network-scripts/ifcfg* files.
To create an IP alias for first lan interface eth0, I've had to created a new file named ifcfg-eth0:0

linux:~# cd /etc/sysconfig/network-scripts/
linux:~# vim ifcfg-eth0:0

with below content


Adding this IP address network alias works across all RPM based distributions and should work also on Fedora and Open SuSE as well as Suse Enterprise Linux.
If you however prefer to use a text GUI and do it the CentOS server administration way you can use nmtui (Text User Interface for controlling NetworkManager). tool.

linux:~# nmtui




How to install jcmd on CentOS 7 to diagnose running Java Virtual Machine crashing applications

Tuesday, January 14th, 2020


jcmd utility is well known in the Brane New wonderful world of Java but if you're like me a classical old school sysadmins and non-java developer you probably never heard it hence before going straight into how to install it on CentOS 7 Linux servers, I'll shortly say few words on what it is.
jcmd is used to send diagnostic requests to running Java Virtual Machine (JVM) it is available in both in Oracle Java as well as OpenJDK.
The requests jcmd sends to VM are based on the running Java PID ID and are pretty useful for controlling Java Flight Recordings, troubleshoot, and diagnose JVM and Java Applications. It must be used on the same machine where the JVM is running.

 Used without arguments or with the -l option, jcmd prints the list of running Java processes with their process id, their main class and their command line arguments.
When a main class is specified on the command line, jcmd sends the diagnostic command request to all Java processes for which the command line argument is a substring of the Java process' main class.
jcmd could be useful if the JConsoleJMX (Java Management Extensions) can't be used for some reason on the server or together with Java Visual VM (visual interface for viewing detailed info about Java App).

In most Linux distributionsas as  of year 2020 jcmd is found in  java-*-openjdk-headless.

To have jcmd on lets say Debian GNU / Linux, you're up to something like:


apt-get install –yes openjdk-12-jdk-headless


apt-get install openjdk-11-jdk-headless
however in CentOS
7 jcmd is not found in java*openjdk*headless but instead to have it on server, thus it take me a while to look up where it is foundso after hearing
from some online post it is part of package java*openjdk*devel* to make sure this so true, I've used the  –download-only option

 yum install –downloadonly –downloaddir=/tmp java-1.8.0-openjdk-devel


So the next question was how to inspect the downloaded rpm package into /tmp usually, this is possible via Midnight Commander (mc) easily to view contents, however as this
server did not have installed mc due to security policies I had to do it differently after pondering a while on how to to list the RPM package file content come up using following command



 rpm2cpio java-1.8.0-openjdk-devel- |cpio -idmv|grep -i jcmd|less


To then install  java-1.8.0-openjdk-devel- to do so run:




yum -y install  java-1.8.0-openjdk-devel-1.8.0*

Once the jcmd, I've created the following bash script  that was set that was tracking for application errors and checking whether the JBoss application server pool-available-count is not filled up and hence jboss refuses to serve connections  through query automatically launching jcmd to get various diagnostic data about Java Virtual Machine (e.g. a running snapshot) – think of it like the UNIX top for debugging or Windows System Monitor but run one time. 




# PID_OF_JAVA=$(pgrep -l java)
# jcmd $PID_OF_JAVA GC.heap_dump GC.heap_dump_file-$(date '+%Y-%m-%d_%H-%M-%S').jfr
# jcmd $PID_OF_JAVA Thread.print > Thread.print-$(date '+%Y-%m-%d_%H-%M-%S').jfr

The produced log files can then be used by the developer to visualize some Java specific stuff "Flight recordings" like in below screenshot:


If you're interested on some other interesting tools that can be used to Monitor  and Debug a Running Java VM take a look at Java's official documentation Monitoring Tools.
So that's all Mission Accomplished 🙂 Now the Java Application developer could observe the log and tell why exactly the application crashed after the multitude of thrown Exceptions in the JBoss server.log.