Archive for February, 2020

Improve DNS lookup domain resolve speed on Linux / UNIX servers through /etc/resolv.conf timeout, attempts, rorate options

Thursday, February 27th, 2020

If you're an performance optimization freak and you want to optimize your Linux servers to perform better in terms of DNS resolve slowness because of failing DNS resolve queries due to Domain Name Server request overload or due to Denial of Service attack towards it. It might be interesting to mention about some little known functionalities of /etc/resolv.conf described in the manual page.

The defined nameservers under /etc/resolv.conf are queried one by one waiting for responce of the sent DNS resolve request if it is not replied from the first one for some time, the 2nd one is queried until a responce is received by any of the defined nameserver IPs

A default /etc/resolv.conf on a new Linux server install looks something like this:


However one thing is that defined if NS1 dies out due to anything, it takes timeout time until the second or 3rd working one takes over to resolve the query.
This is controlled by the timeout value.

Below is description from man page

sets the amount of time the resolver will wait for a
response from a remote name server before retrying the
query via a different name server.  Measured in
seconds, the default is RES_TIMEOUT (currently 5, see
<resolv.h>).  The value for this option is silently
capped to 30.


  • In other words Timeout value is time to resolving IP address from hostname through DNS server,timeout option is to reduce hostname lookup time

As you see from manual default is 5 seconds which is quite high, thus reducing the value to 3 secs or even 1 seconds is a good sysadmin practice IMHO.

Another value that could be tuned in /etc/resolv.conf is attempts value below is what the manual says about it: 

                     Sets the number of times the resolver will send a query to its name servers before giving up and returning an error to the calling application.  The default is RES_DFLRETRY (cur‐
                     rently 2, see <resolv.h>).  The value for this option is silently capped to 5.



  • This means default behaviour on a failing DNS query resolve is to try to resend the DNS resolve request to the failing nameserver 5 more times, that is quite high thus it is a good practice from my experience to reduce it to something as 2 or 1

Another very useful resolv.conf value is rotate
The default behavior of how DNS outgoing Domain requests are handled is to use only the primary defined DNS, instead if you need to do a load balancing in a round-robin manner add to conf rotate option.

The final /etc/resolv.conf optimized would look like so:


linux# cat /etc/resolv.conf

options ndots:1
options timeout:1
options attempts:1
options rotate

The search opt. placement is also important to be placed in the right location in the file. The correct placement is after the nameservers defined, I have to say in older Linux distributions the correct placement of search option was to be on top of resolv.conf.

Note that this configuration is good and fits not only Linux but also is a good DNS lookup optimization speed on other UNIX derivatives such as FreeBSD / NetBSD as well as other Proprietary OS UNIX machines running IBM AIX etc.

On Linux it is also possible to place the options given in one single line like so, below is the config I have on my running Lenovo server:


options timeout:2 attempts:1 rotate


When is /etc/hosts record venerated and when is /etc/resolv.conf DNS defined queried for a defined DNS host?


One important thing to know when dealing with /etc/resolv.conf  is what happens if a Name domain is defined in both /etc/hosts and /etc/resolv.conf.
For example you have a domain record in /etc/hosts to a certain domain
but the DNS nameserver in Google has a record to an IP that is the real IP


# dig @

; <<>> DiG 9.11.5-P4-5.1-Debian <<>> @
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 54656
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 1

; EDNS: version: 0, flags:; udp: 512
;                  IN      A

;; ANSWER SECTION:           3599    IN      A

;; Query time: 40 msec
;; WHEN: чт фев 27 18:04:23 EET 2020
;; MSG SIZE  rcvd: 57


  • Which of the 2 different IPs will the applications installed on the server such as Apache / Squid / MySQL / tinyproxy for their DNS resolve operations?


Now it is time to say few words about /etc/nsswitch.conf (The Nameserver switching configuration file). This file defines the DNS resolve file used order in which the Operationg System does IP to domain translation and backwards.

# grep -i hosts: /etc/nsswitch.conf

hosts:          files dns myhostname

As you can see first the local defined in files like /etc/hosts record is venerated when resolving, then it is the externally configured DNS resolver IPs from /etc/resolv.conf.

nsswitch.conf  is used also for defining where the OS will look up for user / passwd (e.g. login credentials) on login, on systems which are having an LDAP authentication via the sssd (system security services daemon) via definitions like:


passwd:     files sss
shadow:     files sss
group:      files sss

E.g. the user login will be first try to read from local /etc/passwd , /etc/shadow , /etc/groups and if no matched record is found then the LDAP service the sssd is queried.

Fix Blocked Unresponsive keyboard keys on Windows 7 / 10

Tuesday, February 25th, 2020



The Problem

If you're still using Windows 7 Operating system in your company due to some weird security concern policies and your company gets custom end user updates from Microsoft due to a special EULA agreement and you are new to using Windows E.g. have for many users used for your daily work Linux and Mac OS you might hit a strange issue wtih many of the keyboard keys strangely being locked with some of the keys such as Num Lock, Escape tab working where the Alphabet keys then don't panic. This is not a Windows bug its a feature (as usual) 🙂

Reason behind Blocked Unresponsive Keyboard


First logical thought I had is maybe my Logitech K120 Membrane 17 EURO cheap keyboard externally attached keyboard broke up thus I've tried to connect another LOGIC keyboard I had at hand just to assure myself the problem with partial keys on kbd reacting was present with the other Working keyboard as well.
This was an indicator that either the custom installed Windows by the company Helpdesk Office with the preassumed common additional features for corporations such as Keyloggers on this Laptop has messed up somohow the Windows service that is managing the keyboard or some kind of mechanical error or electronic circuit on the laptop embedded keyboard has occured or the KBD DLL Loaded driver damaged
I have to say here a colleague of mine was having a weird keyboard problems back in the day when I was still working in Project Services as a Web and Middleware in Hewlett Packard, where misteriously a character was added to his typed content just like a key on his keyboard has stuck and he experienced this issue for quite some time, he opened the keyboard to physically check whether all is okay and even checked the keyboard electricity whether levels on each of the keys and he couldn't find anything, and after running Malware Bytes anti-malware and a couple of other anti-malware programs which found his computer was infected with Malware and issue resolved.


Hard Fix – Reboot ..


As as a solution most times so far I've restarted the Windows which was reloading the Windows kernel / DLL libraries etc.

However just hit to this Windows accessibility feature once again today and since this is not the first time I end up with unworking keyboard (perhaps due to my often) furious fast typing – where I press sometimes multiple keys in parallel as a typing error then you trigger the Windows Disabled accessibiltiy Windows Feature, which as I thought makes the PC only usable for Mouse but unusuable for providing any meaningful keyboard input.

This problem I've faced already multiple times and usually the work around was the good known Windows User recipee phrase "Restart and It will get fixed", this time I was pissed off and didn't wanted to loose another 5 minutes in Restarting Reconnecting to the Company's Cisco Secure VPN reopening all my used files Notepad++ / Outlook / Browsers etc plus I was already part of online Lync (Skype) Meeting in which Colleague was Sharing his remote Desktop checking some important stuff about Zabbix Monitored AIX machine, hence didn't wanted to restart but still wanted to use my computer and type some stuff to send Email and do a simple googling.

Temporary work around to complete work with Virtual Keyboard

Hence as a temporary work around, I've used the Windows Virtual Keyboard, I've mentioned it in the earlier  blog post – How to run Virtual Keyboard in Windows XP / Vista article
To do so I'verun by typing osk command in cmd.exe command Prompt:

Either Search for osk.exe from Start menu


or run via command line via

Windows Button (on the Keyboard) + R and run

cmd.exe -> osk




Solution Without PC Restart

After a bit of thought and Googling I've found the fix  here

From the Start -> Control Panel from here I had to go to Accessibility Options.
Select Ease of Access Center.


Select the keyboard settings and
Ensure the following options are unchecked: Turn on Sticky Keys, Turn on Toggle Keys and Turn on Filter Keys.


I've found in the Turn on Toggle Keys tick present (e.g. service was enabled) – hence  after unticking it and 

Press Apply and OK, keyboard restored its usual functions.
Now all left was to  Enjoy as your keyboard was back usable and I could conitnue my Citrix sessions and SSH console Superputty terminals  and complete my started to write E-mail
without loosing time meanlessly for reboot.


Windows 10 Disable the Filter Keys option


This feature makes your keyboard ignore brief or repeated keystrokes, which might have led to your WinKey issue in Windows 10. To disable filter keys, use the instructions below:

1. Right-click on your Start menu icon.
2. Select Settings from the menu.
3. Navigate to Ease of Access and click on it.
4. Go to the left pane and click Keyboard.
5. Locate the Filter Keys feature.
6. Toggle it off.
7. Check if this manoeuvre has resolved your issue.

Closing Notes 

Of course this might be not always the fix, as sometimes it could be that the Winblows just blows your keyboard buffer due to some buggy application or a bug, but in most of the times that should solve it 🙂
If it didn't go through and debug all the other possible reasons, check whether you have a faulty keyboard cable (if you're still on a non-bluetooth Wired Keyboard), unplug and plug the keyboard again,
scan the computer for spyware and malware, rethink what really happened or what have you done until the problem occured and whether blocked keyboard is triggered by your user action or was triggered
by some third party software anti-virus stuff that did it as an attempt to prevent keylog sniffer / Virus or other weird stuff.

How to check if shared library is loaded in AIX OS – Fix missing

Thursday, February 20th, 2020


I've had to find out whether an externally Linux library is installed  on AIX system and whether something is not using it.
The returned errors was like so:


# gpg –export -a

Could not load program gpg:
Dependent module /opt/custom/lib/libreadline.a( could not be loaded.
Member is not found in archive

After a bit of investigation, I found that gpg was failing cause it linked to older version of, the workaround was to just substitute the newer version of over the original installed one.

Thus I had a plan to first find out whether this libreadline.a is loaded and recognized by AIX UNIX first and second find out whether some of the running processes is not using that library.
I've come across this interesting IBM official documenation that describes pretty good insights on how to determine whether a shared library  is currently loaded on the system. which mentions the genkld command that is doing
exactly what I needed.

In short:
genkld – creates a list that is printed to the console that shows all loaded shared libraries


Next I used lsof (list open files) command to check whether there is in real time opened libraries by any of the running programs on the system.

After not finding anything and was sure the library is neither loaded as a system library in AIX nor it is used by any of the currently running AIX processes, I was sure I could proceed to safely overwrite libreadline.a ( with libreadline.a with (

The result of that is again a normally running gpg as ldd command shows the binary is again normally linked to its dependend system libraries.

aix# ldd /usr/bin/gpg
/usr/bin/gpg needs:



# gpg –version
gpg (GnuPG) 1.4.22
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.


Home: ~/.gnupg
Supported algorithms:
Hash: MD5, SHA1, RIPEMD160, SHA256, SHA384, SHA512, SHA224
Compression: Uncompressed, ZIP, ZLIB, BZIP2



How to enable HaProxy logging to a separate log /var/log/haproxy.log / prevent HAProxy duplicate messages to appear in /var/log/messages

Wednesday, February 19th, 2020

haproxy  logging can be managed in different form the most straight forward way is to directly use /dev/log either you can configure it to use some log management service as syslog or rsyslogd for that.

If you don't use rsyslog yet to install it: 

# apt install -y rsyslog

Then to activate logging via rsyslogd we can should add either to /etc/rsyslogd.conf or create a separte file and include it via /etc/rsyslogd.conf with following content:

Enable haproxy logging from rsyslogd

Log haproxy messages to separate log file you can use some of the usual syslog local0 to local7 locally used descriptors inside the conf (be aware that if you try to use some wrong value like local8, local9 as a logging facility you will get with empty haproxy.log, even though the permissions of /var/log/haproxy.log are readable and owned by haproxy user.

When logging to a local Syslog service, writing to a UNIX socket can be faster than targeting the TCP loopback address. Generally, on Linux systems, a UNIX socket listening for Syslog messages is available at /dev/log because this is where the syslog() function of the GNU C library is sending messages by default. To address UNIX socket in haproxy.cfg use:

log /dev/log local2 

If you want to log into separate log each of multiple running haproxy instances with different haproxy*.cfg add to /etc/rsyslog.conf lines like:

local2.* -/var/log/haproxylog2.log
local3.* -/var/log/haproxylog3.log

One important note to make here is since rsyslogd is used for haproxy logging you need to have enabled in rsyslogd imudp and have a UDP port listener on the machine.

E.g. somewhere in rsyslog.conf or via rsyslog include file from /etc/rsyslog.d/*.conf needs to have defined following lines:

$ModLoad imudp
$UDPServerRun 514

I prefer to use external /etc/rsyslog.d/20-haproxy.conf include file that is loaded and enabled rsyslogd via /etc/rsyslog.conf:

# vim /etc/rsyslog.d/20-haproxy.conf

$ModLoad imudp
$UDPServerRun 514​
local2.* -/var/log/haproxy2.log

It is also possible to produce different haproxy log output based on the severiy to differentiate between important and less important messages, to do so you'll need to rsyslog.conf something like:

# Creating separate log files based on the severity
local0.* /var/log/haproxy-traffic.log
local0.notice /var/log/haproxy-admin.log


Prevent Haproxy duplicate messages to appear in /var/log/messages

If you use local2 and some default rsyslog configuration then you will end up with the messages coming from haproxy towards local2 facility producing doubled simultaneous records to both your pre-defined /var/log/haproxy.log and /var/log/messages on Proxy servers that receive few thousands of simultanous connections per second.
This is a problem since doubling the log will produce too much data and on systems with smaller /var/ partition you will quickly run out of space + this haproxy requests logging to /var/log/messages makes the file quite unreadable for normal system events which are so important to track clearly what is happening on the server daily.

To prevent the haproxy duplicate messages you need to define somewhere in rsyslogd usually /etc/rsyslog.conf local2.none near line of facilities configured to log to file:

*.info;mail.none;authpriv.none;cron.none;local2.none     /var/log/messages

This configuration should work but is more rarely used as most people prefer to have haproxy log being written not directly to /dev/log which is used by other services such as syslogd / rsyslogd.

To use /dev/log to output logs from haproxy configuration in global section use config like:

        log /dev/log local2 debug
        chroot /var/lib/haproxy
        stats socket /run/haproxy/admin.sock mode 660 level admin
        stats timeout 30s
        user haproxy
        group haproxy

The log global directive basically says, use the log line that was set in the global section for whole config till end of file. Putting a log global directive into the defaults section is equivalent to putting it into all of the subsequent proxy sections.

Using global logging rules is the most common HAProxy setup, but you can put them directly into a frontend section instead. It can be useful to have a different logging configuration as a one-off. For example, you might want to point to a different target Syslog server, use a different logging facility, or capture different severity levels depending on the use case of the backend application. 

Insetad of using /dev/log interface that is on many distributions heavily used by systemd to store / manage and distribute logs,  many haproxy server sysadmins nowdays prefer to use rsyslogd as a default logging facility that will manage haproxy logs.
Admins prefer to use some kind of mediator service to manage log writting such as rsyslogd or syslog, the reason behind might vary but perhaps most important reason is  by using rsyslogd it is possible to write logs simultaneously locally on disk and also forward logs  to a remote Logging server  running rsyslogd service.

Logging is defined in /etc/haproxy/haproxy.cfg or the respective configuration through global section but could be also configured to do a separate logging based on each of the defined Frontend Backends or default section. 
A sample exceprt from this section looks something like:

# Global settings
    log local2

    chroot      /var/lib/haproxy
    pidfile     /var/run/
    maxconn     4000
    user        haproxy
    group       haproxy

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

    mode                    tcp
    log                     global
    option                  tcplog
    #option                  dontlognull
    #option http-server-close
    #option forwardfor       except
    option                  redispatch
    retries                 7
    #timeout http-request    10s
    timeout queue           10m
    timeout connect         30s
    timeout client          20m
    timeout server          10m
    #timeout http-keep-alive 10s
    timeout check           30s
    maxconn                 3000



# HAProxy Monitoring Config
listen stats                #Haproxy Monitoring run on port 8080
    mode http
    option httplog
    option http-server-close
    stats enable
    stats show-legends
    stats refresh 5s
    stats uri /stats                            #URL for HAProxy monitoring
    stats realm Haproxy\ Statistics
    stats auth hproxyauser:Password___          #User and Password for login to the monitoring dashboard


# frontend which proxys to the backends
frontend ft_DKV_PROD_WLPFO
    mode tcp
    option tcplog
    log-format %ci:%cp\ [%t]\ %ft\ %b/%s\ %Tw/%Tc/%Tt\ %B\ %ts\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq
    default_backend Default_Bakend_Name

# round robin balancing between the various backends
backend bk_DKV_PROD_WLPFO
    mode tcp
    # (0) Load Balancing Method.
    balance source
    # (4) Peer Sync: a sticky session is a session maintained by persistence
    stick-table type ip size 1m peers hapeers expire 60m
    stick on src
    # (5) Server List
    # (5.1) Backend
    server Backend_Server1 check port 18088
    server Backend_Server2 check port 18088 backup

The log directive in above config instructs HAProxy to send logs to the Syslog server listening at Messages are sent with facility local2, which is one of the standard, user-defined Syslog facilities. It’s also the facility that our rsyslog configuration is expecting. You can add more than one log statement to send output to multiple Syslog servers.

Once rsyslog and haproxy logging is configured as a minumum you need to restart rsyslog (assuming that haproxy config is already properly loaded):

# systemctl restart rsyslogd.service

To make sure rsyslog reloaded successfully:

systemctl status rsyslogd.service

Restarting HAproxy

If the rsyslogd logging to port 514 was recently added a HAProxy restart should also be run, you can do it with:

# /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p /var/run/ -sf $(cat /var/run/

Or to restart use systemctl script (if haproxy is not used in a cluster with corosync / heartbeat).

# systemctl restart haproxy.service

You can control how much information is logged by adding a Syslog level by

    log local2 info

The accepted values are the standard syslog security level severity:

Value Severity Keyword Deprecated keywords Description Condition
0 Emergency emerg panic System is unusable A panic condition.
1 Alert alert   Action must be taken immediately A condition that should be corrected immediately, such as a corrupted system database.
2 Critical crit   Critical conditions Hard device errors.
3 Error err error Error conditions  
4 Warning warning warn Warning conditions  
5 Notice notice   Normal but significant conditions Conditions that are not error conditions, but that may require special handling.
6 Informational info   Informational messages  
7 Debug debug   Debug-level messages Messages that contain information normally of use only when debugging a program.


Logging only errors / timeouts / retries and errors is done with option:

Note that if the rsyslog is configured to listen on different port for some weird reason you should not forget to set the proper listen port, e.g.:

  log local2 info

option dontlog-normal

in defaults or frontend section.

You most likely want to enable this only during certain times, such as when performing benchmarking tests.

(or log-format-sd for structured-data syslog) directive in your defaults or frontend

Haproxy Logging shortly explained

The type of logging you’ll see is determined by the proxy mode that you set within HAProxy. HAProxy can operate either as a Layer 4 (TCP) proxy or as Layer 7 (HTTP) proxy. TCP mode is the default. In this mode, a full-duplex connection is established between clients and servers, and no layer 7 examination will be performed. When in TCP mode, which is set by adding mode tcp, you should also add option tcplog. With this option, the log format defaults to a structure that provides useful information like Layer 4 connection details, timers, byte count and so on.

Below is example of configured logging with some explanations:

Log-format "%ci:%cp [%t] %ft %b/%s %Tw/%Tc/%Tt %B %ts %ac/%fc/%bc/%sc/%rc %sq/%bq"

Example of Log-Format configuration as shown above outputted of haproxy config:

Log-format "%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r"


To understand meaning of this abbreviations you'll have to closely read  haproxy-log-format.txt. More in depth info is to be found in HTTP Log format documentation


Logging HTTP request headers

HTTP request header can be logged via:

 http-request capture

frontend website
    bind :80
    http-request capture req.hdr(Host) len 10
    http-request capture req.hdr(User-Agent) len 100
    default_backend webservers

The log will show headers between curly braces and separated by pipe symbols. Here you can see the Host and User-Agent headers for a request: [20/Dec/2018:22:20:00.899] website~ webservers/server1 0/0/1/0/1 200 462 – – —- 1/1/0/0/0 0/0 {|Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/71.0.3578.80 } "GET / HTTP/1.1"


Haproxy Stats Monitoring Web interface

Haproxy is having a simplistic stats interface which if enabled produces some general useful information like in above screenshot, through which
you can get a very basic in browser statistics and track potential issues with the proxied traffic for all configured backends / frontends incoming outgoing
network packets configured nodes
 experienced downtimes etc.


The basic configuration to make the stats interface accessible would be like pointed in above config for example to enable network listener on address

with hproxyuser / password config would be:

# HAProxy Monitoring Config
listen stats                #Haproxy Monitoring run on port 8080
    mode http
    option httplog
    option http-server-close
    stats enable
    stats show-legends
    stats refresh 5s
    stats uri /stats                            #URL for HAProxy monitoring
    stats realm Haproxy\ Statistics
    stats auth hproxyauser:Password___          #User and Password for login to the monitoring dashboard



Sessions states and disconnect errors on new application setup

Both TCP and HTTP logs include a termination state code that tells you the way in which the TCP or HTTP session ended. It’s a two-character code. The first character reports the first event that caused the session to terminate, while the second reports the TCP or HTTP session state when it was closed.

Here are some essential termination codes to track in for in the log:

Here are some termination code examples most commonly to see on TCP connection establishment errors:

Two-character code    Meaning
—    Normal termination on both sides.
cD    The client did not send nor acknowledge any data and eventually timeout client expired.
SC    The server explicitly refused the TCP connection.
PC    The proxy refused to establish a connection to the server because the process’ socket limit was reached while attempting to connect.

To get all non-properly exited codes the easiest way is to just grep for anything that is different from a termination code –, like that:

tail -f /var/log/haproxy.log | grep -v ' — '

This should output in real time every TCP connection that is exiting improperly.

There’s a wide variety of reasons a connection may have been closed. Detailed information about all possible termination codes can be found in the HAProxy documentation.
To get better understanding a very useful reading to haproxy Debug errors with  is in haproxy-logging.txt in that small file are collected all the cryptic error messages codes you might find in your logs when you're first time configuring the Haproxy frontend / backend and the backend application behind.

Another useful analyze tool which can be used to analyze Layer 7 HTTP traffic is halog for more on it just google around.

Procedure Instructions to safe upgrade CentOS / RHEL Linux 7 Core to latest release

Thursday, February 13th, 2020


Generally upgrading both RHEL and CentOS can be done straight with yum tool just we're pretty aware and mostly anyone could do the update, but it is good idea to do some
steps in advance to make backup of any old basic files that might help us to debug what is wrong in case if the Operating System fails to boot after the routine Machine OS restart
after the upgrade that is usually a good idea to make sure that machine is still bootable after the upgrade.

This procedure can be shortened or maybe extended depending on the needs of the custom case but the general framework should be useful anyways to someone that's why
I decided to post this.

Before you go lets prepare a small status script which we'll use to report status of  sysctl installed and enabled services as well as the netstat connections state and
configured IP addresses and routing on the system.

The script to be used during our different upgrade stages:

# script status ###
echo "STARTED: $(date '+%Y-%m-%d_%H-%M-%S'):" | tee /root/logs/yumcheckupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
systemctl list-unit-files –type=service | grep enabled
systemctl | grep ".service" | grep "running"
netstat -tulpn
netstat -r
ip a s
/sbin/route -n
echo "ENDED $(date '+%Y-%m-%d_%H-%M-%S'):" | tee /root/logs/yumcheckupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out


– Save the script in any file like /root/

– Make the /root/logs directoriy.

[root@redhat: ~ ]# mkdir /root/logs
[root@redhat: ~ ]# vim /root/
[root@redhat: ~ ]# chmod +x /root/


1. Get a dump of CentOS installed version release and grub-mkconfig generated os_probe


[root@redhat: ~ ]# cat /etc/redhat-release  > /root/logs/redhat-release-vorher-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
[root@redhat: ~ ]# cat /etc/grub.d/30_os-prober > /root/logs/grub2-efi-vorher-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out


2. Clear old versionlock marked RPM packages (if there are such)


On servers maintained by multitude of system administrators just like the case is inside a Global Corporations and generally in the corporate world , where people do access the systems via LDAP and more than a single person
has superuser privileges. It is a good prevention measure to use yum package management  functionality to RPM based Linux distributions called  versionlock.
versionlock for those who hear it for a first time is locking the versions of the installed RPM packages so if someone by mistake or on purpose decides to do something like :

[root@redhat: ~ ]# yum install packageversion

Having the versionlock set will prevent the updated package to be installed with a different branch package version.

Also it will prevent a playful unknowing person who just wants to upgrade the system without any deep knowledge to be able to

[root@redhat: ~ ]# yum upgrade

update and leave the system in unbootable state, that will be only revealed during the next system reboot.

If you haven't used versionlock before and you want to use it you can do it with:

[root@redhat: ~ ]# yum install yum-plugin-versionlock

To add all the packages for compiling C code and all the interdependend packages, you can do something like:


[root@redhat: ~ ]# yum versionlock gcc-*

If you want to clear up the versionlock, once it is in use run:

[root@redhat: ~ ]#  yum versionlock clear
[root@redhat: ~ ]#  yum versionlock list


3.  Check RPC enabled / disabled


This step is not necessery but it is a good idea to check whether it running on the system, because sometimes after upgrade rpcbind gets automatically started after package upgrade and reboot. 
If we find it running we'll need to stop and mask the service.


# check if rpc enabled
[root@redhat: ~ ]# systemctl list-unit-files|grep -i rpc
var-lib-nfs-rpc_pipefs.mount                                      static
auth-rpcgss-module.service                                        static
rpc-gssd.service                                                  static
rpc-rquotad.service                                               disabled
rpc-statd-notify.service                                          static
rpc-statd.service                                                 static
rpcbind.service                                                   disabled
rpcgssd.service                                                   static
rpcidmapd.service                                                 static
rpcbind.socket                                                    disabled                                                 static                                                    static

[root@redhat: ~ ]# systemctl status rpcbind.service
● rpcbind.service – RPC bind service
   Loaded: loaded (/usr/lib/systemd/system/rpcbind.service; disabled; vendor preset: enabled)
   Active: inactive (dead)


[root@redhat: ~ ]# systemctl status rpcbind.socket
● rpcbind.socket – RPCbind Server Activation Socket
   Loaded: loaded (/usr/lib/systemd/system/rpcbind.socket; disabled; vendor preset: enabled)
   Active: inactive (dead)
   Listen: /var/run/rpcbind.sock (Stream)
           [::]:111 (Stream)
           [::]:111 (Datagram)


4. Check any previously existing downloaded / installed RPMs (check yum cache)


yum install package-name / yum upgrade keeps downloaded packages via its operations inside its cache directory structures in /var/cache/yum/*.
Hence it is good idea to check what were the previously installed packages and their count.


[root@redhat: ~ ]# cd /var/cache/yum/x86_64/;
[root@redhat: ~ ]# find . -iname '*.rpm'|wc -l


5. List RPM repositories set on the server


 [root@redhat: ~ ]# yum repolist
Loaded plugins: fastestmirror, versionlock
Repodata is over 2 weeks old. Install yum-cron? Or run: yum makecache fast
Determining fastest mirrors
repo id                                                                                 repo name                                                                                                            status
!atos-ac/7/x86_64                                                                       Atos Repository                                                                                                       3,128
!base/7/x86_64                                                                          CentOS-7 – Base                                                                                                      10,019
!cr/7/x86_64                                                                            CentOS-7 – CR                                                                                                         2,686
!epel/x86_64                                                                            Extra Packages for Enterprise Linux 7 – x86_64                                                                          165
!extras/7/x86_64                                                                        CentOS-7 – Extras                                                                                                       435
!updates/7/x86_64                                                                       CentOS-7 – Updates                                                                                                    2,500


This step is mandatory to make sure you're upgrading to latest packages from the right repositories for more concretics check what is inside in confs /etc/yum.repos.d/ ,  /etc/yum.conf 

6. Clean up any old rpm yum cache packages


This step is again mandatory but a good to follow just to have some more clearness on what packages is our upgrade downloading (not to mix up the old upgrades / installs with our newest one).
For documentation purposes all deleted packages list if such is to be kept under /root/logs/yumclean-install*.out file

[root@redhat: ~ ]# yum clean all |tee /root/logs/yumcleanall-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out


7. List the upgradeable packages's latest repository provided versions


[root@redhat: ~ ]# yum check-update |tee /root/logs/yumcheckupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out


Then to be aware how many packages we'll be updating:


[root@redhat: ~ ]#  yum check-update | wc -l


8. Apply the actual uplisted RPM packages to be upgraded


[root@redhat: ~ ]# yum update |tee /root/logs/yumupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out


Again output is logged to /root/logs/yumcheckupate-*.out 


9. Monitor downloaded packages count real time


To make sure yum upgrade is not in some hanging state and just get some general idea in which state of the upgrade is it e.g. Download / Pre-Update / Install  / Upgrade/ Post-Update etc.
in mean time when yum upgrade is running to monitor,  how many packages has the yum upgrade downloaded from remote RPM set repositories:


[root@redhat: ~ ]#  watch "ls -al /var/cache/yum/x86_64/7Server/…OS-repository…/packages/|wc -l"


10. Run status script to get the status again


[root@redhat: ~ ]# sh /root/ |tee /root/logs/status-before-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out


11. Add back versionlock for all RPM packs


Set all RPM packages installed on the RHEL / CentOS versionlock for all packages.


#==if needed
# yum versionlock \*



12. Get whether old software configuration is not messed up during the Package upgrade (Lookup the logs for .rpmsave and .rpmnew)


During the upgrade old RPM configuration is probably changed and yum did automatically save .rpmsave / .rpmnew saves of it thus it is a good idea to grep the prepared logs for any matches of this 2 strings :

[root@redhat: ~ ]#   grep -i ".rpm" /root/logs/yumupdate-server-host-2020-01-20_14-30-41.out
[root@redhat: ~ ]#  grep -i ".rpmsave" /root/logs/yumupdate-server-host-2020-01-20_14-30-41.out
[root@redhat: ~ ]#  grep -i ".rpmnew" /root/logs/yumupdate-server-host-2020-01-20_14-30-41.out

If above commands returns output usually it is fine if there is is .rpmnew output but, if you get grep output of .rpmsave it is a good idea to review the files compare with the original files that were .rpmsaved with the 
substituted config file and atune the differences with the changes manually made for some program functionality.

What are the .rpmsave / .rpmnew files ?
This files are coded files that got triggered by the RPM install / upgrade due to prewritten procedures on time of RPM build.


If a file was installed as part of a rpm, it is a config file (i.e. marked with the %config tag), you've edited the file afterwards and you now update the rpm then the new config file (from the newer rpm) will replace your old config file (i.e. become the active file).
The latter will be renamed with the .rpmsave suffix.

If a file was installed as part of a rpm, it is a noreplace-config file (i.e. marked with the %config(noreplace) tag), you've edited the file afterwards and you now update the rpm then your old config file will stay in place (i.e. stay active) and the new config file (from the newer rpm) will be copied to disk with the .rpmnew suffix.
See e.g. this table for all the details. 

In both cases you or some program has edited the config file(s) and that's why you see the .rpmsave / .rpmnew files after the upgrade because rpm will upgrade config files silently and without backup files if the local file is untouched.

After a system upgrade it is a good idea to scan your filesystem for these files and make sure that correct config files are active and maybe merge the new contents from the .rpmnew files into the production files. You can remove the .rpmsave and .rpmnew files when you're done.

If you need to get a list of all .rpmnew .rpmsave files on the server do:

[root@redhat: ~ ]#  find / -print | egrep "rpmnew$|rpmsave$


13. Reboot the system 

To check whether on next hang up or power outage the system will boot normally after the upgrade, reboot to test it.


you can :


[root@redhat: ~ ]#  reboot



[root@redhat: ~ ]#  shutdown -r now

or if on newer Linux with systemd in ues below systemctl

[root@redhat: ~ ]#  systemctl start


14. Get again the system status with our status script after reboot

[root@redhat: ~ ]#  sh /root/ |tee /root/logs/status-after-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out


15. Clean up any versionlocks if earlier set


[root@redhat: ~ ]# yum versionlock clear
[root@redhat: ~ ]# yum versionlock list


16. Check services and logs for problems


After the reboot Check closely all running services on system make sure every process / listening ports and services on the system are running fine, just like before the upgrade.
If the sytem had firewall,  check whether firewall rules are not broken, e.g. some NAT is not missing or anything earlier configured to automatically start via /etc/rc.local or some other
custom scripts were run and have done what was expected. 
Go through all the logs in /var/log that are most essential /var/log/boot.log , /var/log/messages … yum.log etc. that could reveal any issues after the boot. In case if running some application server or mail server check /var/log/mail.log or whenever it is configured to log.
If the system runs apache closely check the logs /var/log/httpd/error.log or php_errors.log for any strange errors that occured due to some issues caused by the newer installed packages.
Usually most of the cases all this should be flawless but a multiple check over your work is a stake for good results.