list Archives - ☩ Walking in Light with Christ - Faith, Computing, Diary ☩ Walking in Light with Christ

Posts Tagged ‘list’

How to count number of ESTABLISHED state TCP connections to a Windows server

Wednesday, March 13th, 2024

Even if you have the background of a Linux system administrator, sooner or later you will have have to deal with some Windows hosts, thus i'll blog in this article shortly on how the established TCP if it happens you will have to administarte a Windows hosts or help a windows sysadmin noobie 🙂

In Linux it is pretty easy to check the number of established conenctions, because of the wonderful command wc (word count). with a simple command like:

$ netstat -etna |wc -l

Then you will get the number of active TCP connections to the machine and based on that you can get an idea on how busy the server is.

But what if you have to deal with lets say a Microsoft Windows 2012 /2019 / 2020 or 2022 Server, assuming you logged in as Administrator and you see the machine is quite loaded and runs multiple Native Windows Administrator common services such as IIS / Active directory Failover Clustering, Proxy server etc.
How can you identify the established number of connections via a simple command in cmd.exe?

1.Count ESTABLISHED TCP connections from Windows Command Line

Here is the answer, simply use netstat native windows command and combine it with find, like that and use the /i (ignores the case of characters when searching the string) /c (count lines containing the string) options

C:\Windows\system32>netstat -p TCP -n| find /i "ESTABLISHED" /c
1268

Voila, here are number of established connections, only 1268 that is relatively low.
However if you manage Windows servers, and you get some kind of hang ups as part of the monitoring, it is a good idea to setup a script based on this simple command for at least Windows Task Scheduler (the equivallent of Linux's crond service) to log for Peaks in Established connections to see whether Server crashes are not related to High Rise in established connections.
Even better if company uses Zabbix / Nagios, OpenNMS or other old legacy monitoring stuff like Joschyd even as of today 2024 used in some big of the TOP IT companies such as SAP (they were still using it about 4 years ago for their SAP HANA Cloud), you can set the script to run and do a Monitoring template or Alerting rules to draw you graphs and Trigger Alerts if your connections hits a peak, then you at least might know your Windows server is under a "Hackers" Denial of Service attack or there is something happening on the network, like Cisco Network Infrastructure Switch flappings or whatever.

Perhaps an example script you can use if you decide to implement the little nestat established connection checks Monitoring in Zabbix is the one i've writen about in the previous article "Calculate established connection from IP address with shell script and log to zabbix graphic".

2. Few Useful netstat options for the Windows system admin

C:\Windows\System32> netstat -bona

Cmd.exe will lists executable files, local and external IP addresses and ports, and the state in list form. You immediately see which programs have created connections or are listening so that you can find offenders quickly.

b – displays the executable involved in creating the connection.
o – displays the owning process ID.
n – displays address and port numbers.
a – displays all connections and listening ports.

As you can see in the screenshot, by using netstat -bona you get which process has binded to which local address and the Process ID PID of it, that is pretty useful in debugging stuff.

3. Use a Third Party GUI tool to debug more interactively connection issues

If you need to keep an eye in interactive mode, sometimes if there are issues CurrPorts tool can be of a great help

currports-windows-network-connections-diagnosis-cports

CurrPorts Tool own Description

CurrPorts is network monitoring software that displays the list of all currently opened TCP/IP and UDP ports on your local computer. For each port in the list, information about the process that opened the port is also displayed, including the process name, full path of the process, version information of the process (product name, file description, and so on), the time that the process was created, and the user that created it.
In addition, CurrPorts allows you to close unwanted TCP connections, kill the process that opened the ports, and save the TCP/UDP ports information to HTML file , XML file, or to tab-delimited text file.
CurrPorts also automatically mark with pink color suspicious TCP/UDP ports owned by unidentified applications (Applications without version information and icons).

Sum it up

What we learned is how to calculate number of established TCP connections from command line, useful for scripting, how you can use netstat to display the process ID and Process name that relates to a used Local / Remote TCP connections, and how eventually you can use this to connect it to some monitoring tool to periodically report High Peaks with TCP established connections (usually an indicator of servere system issues).

Tags: blog, cmd, connections, count tcp connections, created, CurrPorts, find, hang ups, How to, idea, information, linux?, list, machine, netstat, number, ports, process, script, simple, string, tcp, use, Windows, windows server
Posted in Curious Facts, Monitoring, System Administration, System Optimization, Various, Windows | No Comments »

Must have software on freshly installed windows – Essential Software after fresh Windows install

Friday, March 18th, 2016

If you're into IT industry even if you don't like installing frequently Windows or you're completely Linux / BSD user, you will certainly have a lot of friends which will want help from you to re-install or fix their Windows 7 / 8 / 10 OS. At least this is the case with me every year, I'm kinda of obliged to install fresh windowses on new bought friends or relatives notebooks / desktop PCs.

Of course according to for whom the new Windows OS installed the preferrences of necessery software varies, however more or less there is sort of standard list of Windows Software which is used daily by most of Avarage Computer user, such as:

CD Mounting ISO software ( VirtualClone Drive )
CD Recording ( CDBurnerXP )
Archiving / Unarchiving (Zip Rar) Software ( 7-Zip, Winrar, PeaZip )
Office Software ( Apache Open Office – also known widely as LibreOffice )
Internet Browsers ( Mozilla Firefox, Opera, Google Chrome )
Antivirus ( Avira Antivirus or AVG or Kaspersky, Nod32, McAffee (however rarely install non-freeware AV, usually only for companies or organizations) )
Anti-Malware ( Malware Bytes, SpyBot – Search & Destroy )
Chat Audio / Video Conferencing ( Skype / Pidgin (ICQ Client) / Telegram (A free software multiple platform Chat messanger alternative to Skype) )
Video Player ( VLC (Video Lan Client), KMplayer, GOM Player, Quick time, K-lite Codecs )
Music Player ( Winamp (nowadays discontinued), foobar2000, Audacity )
Graphic Design / Graphic Editor / Web Design tools ( The Gimp, Inkscape, PhotoShop and CorelDraw )
Picture Viewer ( IrfanView, FastStone, Picasa )
Java ( Java 7 / 8 JRE and at some PCs where necessery JDK )
PDF Viewers / DJViewPDF ( Adobe Acrobat Reader, Foxit Reader, Sumatra PDF, WinDJView )
Flash Player ( Adobe Shockware Flash player, Adobe Air (install it only on some PCs) )
SSH Client ( MobaXterm (Home Edition), PuTTY (Free Telnet SSH Client) )
FTP Client ( WinSCP, FileZilla )
Programming Text Editor ( Notepad++, Scintilla (SciTE), GVIM, Eclipse (only on developers and programmers PCs) )
Virtualization Software ( VirtualBox, VMWare (install VMWare only on Admin PCs))
Remote Desktop and VNC connector program ( RealVNC, TeamViewer )
Bittorrent Client ( uTorrent or qBittorrent or Transmission )
E-Mail POP3 / Imap Client ( Mozilla Thunderbird )
Screenshotting Tool ( GreenShot )
Password Safe program ( Password Safe or KeePass )
Remote Backup and Storage (Cloud) synchronization ( Dropbox, GoogleDrive, OneDrive (Comes preinstalled on Win 8.1 and Windows 10 )
Tools
– UnixUtils – Installs following common Linux command package tools for Windows CLI

bc-1.05, bison-1.28, bzip2-1.0.2, diffutils-2.7, fileutils-3.16, findutils-4.1, flex-2.5.4, gawk-3.1.0
grep-2.4.2, gsar110, gzip-1.2.4, indent-2.2.9, jwhois-2.4.1, less-340, m4-1.4, make-3.78.1
patch-2.5, recode-3.6, rman-3.0.7, sed-3.02, shellutils-1.9.4, tar-1.12, textutils-2.1
unrar-3.00, wget-1.8.2, which-2.4

– (WinDirStat or SpaceSniffer – Tools that can show you which is the biggest files and directory inside a directory tree)
– WinGrep (Grep: Grep for Windows)
– Everest Home Edition ( Hardware System Information – Shows you what is the PC hardware )

Not to forget a good candidate from the list to install on new fresh windows Installation candidates are:

Winrar
PeaZIP
WinZip
GreenShot (to be able to easily screenshot stuff and save pictures locally and to the cloud)
AnyDesk (non free but very functional alternative to TeamViewer) to be able to remotely access remote PC
TightVNC
ITunes / Spotify (for people who have also iPhone smart phone)
DropBox or pCloud (to have some extra cloud free space)
FBReader (for those reading a lot of books in different formats)
Rufus – Rufus is an efficient and lightweight tool to create bootable USB drives. It helps you to create BIOS or UEFI bootable devices. It helps you to create Windows TO Go drives. It provides support for various disk, format, and partition.
Recuva is a data recovery software for Windows 10 (non free)
EaseUS (for specific backup / restore data purposes but unfortunately (non free)
For designers
Adobe Photoshop
Adobe Illustrator
f.lux – to control brightness of screen and potentially Save your eyes
ImDisk virtual Disk Driver
KeePass / PasswordSafe – to Securely store your passwords
Putty / MobaXterm / SecureCRT / mPutty (for system administrators and programmers that has to deal with Linux / UNIX)

I tend to install on New Windows installs and thus I have more or less systematized the process.

I try to usually stick to free software where possible for each of the above categories as a Free Software enthusiast and luckily nowadays there is a lot of non-priprietary or at least free as in beer software available out there.

For Windows sysadmins or College and other public institutions networks including multiple of Windows Computers which are not inside a domain and also for people in computer repair shops where daily dozens of windows pre-installs or a set of software Automatic updates are necessery make sure to take a look at Ninite

ninite-automate-windows-program-deploy-and-update-on-new-windows-os-openoffice-screenshot

As official website introduces Ninite:

Ninite – Install and Update All Your Programs at Once

Of course as Ninite is used by organizations as NASA, Harvard Medical School etc. it is likely the tool might reports your installed list of Windows software and various other Win PC statistical data to Ninite developers and most likely NSA, but this probably doesn't much matter as this is probably by the moment you choose to have installed a Windows OS on your PC.

ninite-choises-to-build-an-install-package-with-useful-essential-windows-software-screenshot

For Windows System Administrators managing small and middle sized network PCs that are not inside a Domain Controller, Ninite could definitely save hours and at cases even days of boring install and maintainance work. HP Enterprise or HP Inc. Employees or ex-employees would definitely love Ninite, because what Ninite does is pretty much like the well known HP Internal Tool PC COE.

Ninite could also prepare an installer containing multiple applications based on the choice on Ninite's website, so that's also a great thing especially if you need to deploy a different type of Users PCs (Scientific / Gamers / Working etc.)

Perhaps there are also other useful things to install on a new fresh Windows installations, if you're using something I'm missing let me know in comments.

Tags: Archiving Unarchiving Zip Rar Software, case, Chat, com, common, course, daily, data, directory, dozens, Eclipse, fix, free software, inside, installed, list, look, lot, matter, multiple, necessery, New Windows, notebooks, official, OS, Pc, possible, process, public institutions, remote desktop, software, sort, standard, system administrators, things, various, website, Windows, windows computers, windows software, www
Posted in Everyday Life, Remote System Administration, System Administration, Various, Windows | 1 Comment »

Install btop on Debian Linux, btop an advanced htop like monitoring for Linux to beautify your console life

Tuesday, May 30th, 2023

I've accidently stubmled on btop a colorful and interactive ncurses like command line utility to provide you a bunch of information about CPU / memory / disks and processes with nice console graphic in the style of Cubic Player 🙂
Those who love htop and like their consoles to be full of shiny colors, will really appreciate those nice Linux monitoring tool.
To install btop on latest current stable Debian bullseyes, you will have to install it via backports, as the regular Debian repositories does not have the tool available out of the box.

To Add backports packages support for your Debian 11:

1. Edit /etc/apt/sources.list and include following repositories

# vim /etc/apt/sources.list

deb http://deb.debian.org/debian bullseye-backports main contrib non-free
deb-src http://deb.debian.org/debian bullseye-backports main contrib non-free

2. Update the known repos list to include it

# apt update
…

3. Install the btop deb package from backports

# apt-cache show btop|grep -A 20 -i descrip
Description-en: Modern and colorful command line resource monitor that shows usage and stats
btop is a modern and colorful command line resource monitor that shows
usage and stats for processor, memory, disks, network and processes.
btop features:
– Easy to use, with a game inspired menu system.
– Full mouse support, all buttons with a highlighted key is clickable
and mouse scroll works in process list and menu boxes.
– Fast and responsive UI with UP, DOWN keys process selection.
– Function for showing detailed stats for selected process.
– Ability to filter processes.
– Easy switching between sorting options.
– Tree view of processes.
– Send any signal to selected process.
– UI menu for changing all config file options.
– Auto scaling graph for network usage.
– Shows IO activity and speeds for disks
– Battery meter
– Selectable symbols for the graphs
– Custom presets
– And more…
btop is written in C++ and is continuation of bashtop and bpytop.
Description-md5: 73df6c70fe01f5bf05cca0e3031c1fe2
Multi-Arch: foreign
Homepage: https://github.com/aristocratos/btop
Section: utils
Priority: optional
Filename: pool/main/b/btop/btop_1.2.7-1~bpo11+1_amd64.deb
Size: 431500
SHA256: d79e35c420a2ac5dd88ee96305e1ea7997166d365bd2f30e14ef57b556aecb36

# apt install -t bullsye-backports btop –yes
…

Once I installed it, I can straight use it except on some of my Linux machines, which were having a strange encoding $LANG defined, those ones spitted some errors like:

root@freak:~# btop
ERROR: No UTF-8 locale detected!
Use –utf-force argument to force start if you're sure your terminal can handle it.

To work around it simply redefine LANG variable and rerun it

# export LANG=en_US.UTF8

# btop

Tags: colorful, command line utility, consoles, debian linux, Install, LANG, life, list, mouse, processor
Posted in Linux, Monitoring, Various | 3 Comments »

Migration of audit messages from snoopy to auditd

Tuesday, April 20th, 2010

his article may be out of date and may be deleted in the future.

This article explains the migration from the previous service "Snoopy" to "Auditd". Only commands that are executed as a user with root rights should be recorded here.

Uninstall/disable snoopy

Configuration of auditd

Files needed
Auditd start/stop script

/etc/init.d/auditd

Rules for monitoring by auditd

/etc/audit/audit.rules

Auditd plugin for syslog service

/etc/audisp/plugins.d/syslog.conf

Edit the /etc/audit/audit.rules file
Auditd can be specifically configured to capture and exclude messages. The following list is helpful for excluding certain event entries ("msgtype"):

* 1000 – 1099 are for commanding the audit system
* 1100 – 1199 user space trusted application messages
* 1200 – 1299 messages internal to the audit daemon
* 1300 – 1399 audit event messages
* 1400 – 1499 kernel SE Linux use
* 1500 – 1599 AppArmor events
* 1600 – 1699 kernel crypto events
* 1700 – 1799 kernel abnormal records
* 1800 – 1999 future kernel use (maybe integrity labels and related events)
* 2001 – 2099 unused (kernel)
* 2100 – 2199 user space anomaly records
* 2200 – 2299 user space actions taken in response to anomalies
* 2300 – 2399 user space generated LSPP events
* 2400 – 2499 user space crypto events
* 2500 – 2999 future user space (maybe integrity labels and related events)
Adding the rules

In order for auditd to record the desired events, rules must be defined.

List of rules set up
Below is a list and explanation of the rules set up:

-a exclude,always -F msgtype>=2400 -F msgtype<=2499
-a exclude,always -F msgtype=PATH
-a exclude,always -F msgtype=CWD
-a exclude,always -F msgtype=EOE
-a exit,always -F arch=b64 -F auid!=0 -F auid!=4294967295 -S execve
-a exit,always -F arch=b32 -F auid!=0 -F auid!=4294967295 -S execve

The first rule excludes crypto events in user space – these include, for example, messages about a user logging in.
The second through fourth rules remove the information not necessary for monitoring before it is logged.
The fifth and sixth rules capture the commands entered by users moving within an interactive shell. Services etc. executed by the system are therefore not recorded.
It should be noted here that a separate rule must be created for systems that contain both 32- and 64-bit commands and libraries.

Rule syntax

In general, it makes sense to keep the number of existing rules low in order to reduce the load. Therefore, if possible, several rule fields (-F option) should be combined in one rule. Since Auditd obviously has a problem with multiple event entries that are defined in plain text, these have been created in individual rules. The syntax description of the individual rules is given in the next listing:

-a contains the instructions
The action value "exclude" and the list value "always" are specified for rules that should not lead to any log entry
The action values "exit" and "always" have been specified for rules that should lead to a log entry
"exit" stands for a log entry after the command has been executed
-F defines a rules field
Depending on the application, the rules defined here filter by event entry ("msgtype"), architecture ("arch") and login UID ("auid").
-S stands for the syscall. In the rules that should lead to a log entry, the value "execve" is monitored – i.e. when commands are executed.

Redirect to syslog

Within the file /etc/audisp/plugins.d/syslog.conf the value

active = no
on

active = yes
set.

restart auditd with the command

/etc/init.d/auditd restart
the settings are accepted.

Additional information

The following man pages can be consulted for more information:

auditctl
audit.rules
auditd
auditd.conf

Tags: command, Configuration, event, future, init, kernel, lead, list, messages, value
Posted in Linux, System Administration | No Comments »

Megaraid SAS software installation on CentOS Linux

Saturday, October 20th, 2012

With a standard el5 on a new Dell server, it may be necessary to install the Dell Raid driver, otherwise the OMSA always reports an error and hardware monitoring is therefore obsolete:

Previously, the megaraid_sys package was now called mptlinux

For this we need the following packages in advance:

# yum install gcc kernel-devel
Now the driver stuff:

# yum install dkms mptlinux
That should have built the new module, better test it:

# modinfo mptsas

# dkms status
After a kernel update it may be necessary to build the driver for the new version:

# dkms build -m mptlinux -v 4.00.38.02

# dkms install -m mptlinux -v 4.00.38.02

Tags: based, better, called, Dell Raid, Display, Enable X11, gcc, Getting X11, installation, list, need, packages, somehost, ssh, standard, tell, update, working, xauth, yum
Posted in Linux, Linux and FreeBSD Desktop | No Comments »

How to RPM update Hypervisors and Virtual Machines running Haproxy High Availability cluster on KVM, Virtuozzo without a downtime on RHEL / CentOS Linux

Friday, May 20th, 2022

Here is the scenario, lets say you have on your daily task list two Hypervisor (HV) hosts running CentOS or RHEL Linux with KVM or Virutozzo technology and inside the HV hosts you have configured at least 2 pairs of virtual machines one residing on HV Host 1 and one residing on HV Host 2 and you need to constantly keep the hosts to the latest distribution major release security patchset.

The Virtual Machines has been running another set of Redhat Linux or CentOS configured to work in a High Availability Cluster running Haproxy / Apache / Postfix or any other kind of HA solution on top of corosync / keepalived or whatever application cluster scripts Free or Open Source technology that supports a switch between clustered Application nodes.

The logical question comes how to keep up the CentOS / RHEL Machines uptodate without interfering with the operations of the Applications running on the cluster?

Assuming that the 2 or more machines are configured to run in Active / Passive App member mode, e.g. one machine is Active at any time and the other is always Passive, a switch is possible between the Active and Passive node.

HAProxy--Load-Balancer-cluster-2-nodes-your-Servers

In this article I'll give a simple step by step tested example on how you I succeeded to update (for security reasons) up to the latest available Distribution major release patchset on one by one first the Clustered App on Virtual Machines 1 and VM2 on Linux Hypervisor Host 1. Then the App cluster VM 1 / VM 2 on Hypervisor Host 2.
And finally update the Hypervisor1 (after moving the Active resources from it to Hypervisor2) and updating the Hypervisor2 after moving the App running resources back on HV1.
I know the procedure is a bit monotonic but it tries to go through everything step by step to try to mitigate any possible problems. In case of failure of some rpm dependencies during yum / dnf tool updates you can always revert to backups so in anyways don't forget to have a fully functional backup of each of the HV hosts and the VMs somewhere on a separate machine before proceeding further, any possible failures due to following my aritcle literally is your responsibility 🙂

0. Check situation before the update on HVs / get VM IDs etc.

Check the virsion of each of the machines to be updated both Hypervisor and Hosted VMs, on each machine run:

# cat /etc/redhat-release
CentOS Linux release 7.9.2009 (Core)

The machine setup I'll be dealing with is as follows:

hypervisor-host1 -> hypervisor-host1.fqdn.com
• virt-mach-centos1
• virt-machine-zabbix-proxy-centos (zabbix proxy)

hypervisor-host2 -> hypervisor-host2.fqdn.com
• virt-mach-centos2
• virt-machine-zabbix2-proxy-centos (zabbix proxy)

To check what is yours check out with virsh cmd –if on KVM or with prlctl if using Virutozzo, you should get something like:

[root@hypervisor-host2 ~]# virsh list
Id Name State
—————————————————-
1 vm-host1 running
2 virt-mach-centos2 running

# virsh list –all

[root@hypervisor-host1 ~]# virsh list
Id Name State
—————————————————-
1 vm-host2 running
3 virt-mach-centos1 running

[root@hypervisor-host1 ~]# prlctl list
UUID STATUS IP_ADDR T NAME
{dc37c201-08c9-589d-aa20-9386d63ce3f3} running – VM virt-mach-centos1
{76e8a5f8-caa8-5442-830e-aa4bfe8d42d9} running – VM vm-host2
[root@hypervisor-host1 ~]#

If you have stopped VMs with Virtuozzo to list the stopped ones as well.

# prlctl list -a

[root@hypervisor-host2 74a7bbe8-9245-5385-ac0d-d10299100789]# vzlist -a
CTID NPROC STATUS IP_ADDR HOSTNAME
[root@hypervisor-host2 74a7bbe8-9245-5385-ac0d-d10299100789]# prlctl list
UUID STATUS IP_ADDR T NAME
{92075803-a4ce-5ec0-a3d8-9ee83d85fc76} running – VM virt-mach-centos2
{74a7bbe8-9245-5385-ac0d-d10299100789} running – VM vm-host1

# prlctl list -a

If due to Virtuozzo version above command does not return you can manually check in the VM located folder, VM ID etc.

[root@hypervisor-host2 vmprivate]# ls
74a7bbe8-9245-4385-ac0d-d10299100789 92075803-a4ce-4ec0-a3d8-9ee83d85fc76
[root@hypervisor-host2 vmprivate]# pwd
/vz/vmprivate
[root@hypervisor-host2 vmprivate]#

[root@hypervisor-host1 ~]# ls -al /vz/vmprivate/
total 20
drwxr-x—. 5 root root 4096 Feb 14 2019 .
drwxr-xr-x. 7 root root 4096 Feb 13 2019 ..
drwxr-x–x. 4 root root 4096 Feb 18 2019 1c863dfc-1deb-493c-820f-3005a0457627
drwxr-x–x. 4 root root 4096 Feb 14 2019 76e8a5f8-caa8-4442-830e-aa4bfe8d42d9
drwxr-x–x. 4 root root 4096 Feb 14 2019 dc37c201-08c9-489d-aa20-9386d63ce3f3
[root@hypervisor-host1 ~]#

Before doing anything with the VMs, also don't forget to check the Hypervisor hosts has enough space, otherwise you'll get in big troubles !

[root@hypervisor-host2 vmprivate]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos_hypervisor-host2-root 20G 1.8G 17G 10% /
devtmpfs 20G 0 20G 0% /dev
tmpfs 20G 0 20G 0% /dev/shm
tmpfs 20G 2.0G 18G 11% /run
tmpfs 20G 0 20G 0% /sys/fs/cgroup
/dev/sda1 992M 159M 766M 18% /boot
/dev/mapper/centos_hypervisor-host2-home 9.8G 37M 9.2G 1% /home
/dev/mapper/centos_hypervisor-host2-var 9.8G 355M 8.9G 4% /var
/dev/mapper/centos_hypervisor-host2-vz 755G 25G 692G 4% /vz

[root@hypervisor-host1 ~]# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/mapper/centos-root 50G 1.8G 45G 4% /
devtmpfs 20G 0 20G 0% /dev
tmpfs 20G 0 20G 0% /dev/shm
tmpfs 20G 2.1G 18G 11% /run
tmpfs 20G 0 20G 0% /sys/fs/cgroup
/dev/sda2 992M 153M 772M 17% /boot
/dev/mapper/centos-home 9.8G 37M 9.2G 1% /home
/dev/mapper/centos-var 9.8G 406M 8.9G 5% /var
/dev/mapper/centos-vz 689G 12G 643G 2% /vz

Another thing to do before proceeding with update is to check and tune if needed the amount of CentOS repositories used, before doing anything with yum.

[root@hypervisor-host2 yum.repos.d]# ls -al
total 68
drwxr-xr-x. 2 root root 4096 Oct 6 13:13 .
drwxr-xr-x. 110 root root 12288 Oct 7 11:13 ..
-rw-r–r–. 1 root root 4382 Mar 14 2019 CentOS7.repo
-rw-r–r–. 1 root root 1664 Sep 5 2019 CentOS-Base.repo
-rw-r–r–. 1 root root 1309 Sep 5 2019 CentOS-CR.repo
-rw-r–r–. 1 root root 649 Sep 5 2019 CentOS-Debuginfo.repo
-rw-r–r–. 1 root root 314 Sep 5 2019 CentOS-fasttrack.repo
-rw-r–r–. 1 root root 630 Sep 5 2019 CentOS-Media.repo
-rw-r–r–. 1 root root 1331 Sep 5 2019 CentOS-Sources.repo
-rw-r–r–. 1 root root 6639 Sep 5 2019 CentOS-Vault.repo
-rw-r–r–. 1 root root 1303 Mar 14 2019 factory.repo
-rw-r–r–. 1 root root 666 Sep 8 10:13 openvz.repo
[root@hypervisor-host2 yum.repos.d]#

[root@hypervisor-host1 yum.repos.d]# ls -al
total 68
drwxr-xr-x. 2 root root 4096 Oct 6 13:13 .
drwxr-xr-x. 112 root root 12288 Oct 7 11:09 ..
-rw-r–r–. 1 root root 1664 Sep 5 2019 CentOS-Base.repo
-rw-r–r–. 1 root root 1309 Sep 5 2019 CentOS-CR.repo
-rw-r–r–. 1 root root 649 Sep 5 2019 CentOS-Debuginfo.repo
-rw-r–r–. 1 root root 314 Sep 5 2019 CentOS-fasttrack.repo
-rw-r–r–. 1 root root 630 Sep 5 2019 CentOS-Media.repo
-rw-r–r–. 1 root root 1331 Sep 5 2019 CentOS-Sources.repo
-rw-r–r–. 1 root root 6639 Sep 5 2019 CentOS-Vault.repo
-rw-r–r–. 1 root root 1303 Mar 14 2019 factory.repo
-rw-r–r–. 1 root root 300 Mar 14 2019 obsoleted_tmpls.repo
-rw-r–r–. 1 root root 666 Sep 8 10:13 openvz.repo

1. Dump VM definition XMs (to have it in case if it gets wiped during update)

There is always a possibility that something will fail during the update and you might be unable to restore back to the old version of the Virtual Machine due to some config misconfigurations or whatever thus a very good idea, before proceeding to modify the working VMs is to use KVM's virsh and dump the exact set of XML configuration that makes the VM roll properly.

To do so:
Check a little bit up in the article how we have listed the IDs that are part of the directory containing the VM.

[root@hypervisor-host1 ]# virsh dumpxml (Id of VM virt-mach-centos1 ) > /root/virt-mach-centos1_config_bak.xml
[root@hypervisor-host2 ]# virsh dumpxml (Id of VM virt-mach-centos2) > /root/virt-mach-centos2_config_bak.xml

2. Set on standby virt-mach-centos1 (virt-mach-centos1)

As I'm upgrading two machines that are configured to run an haproxy corosync cluster, before proceeding to update the active host, we have to switch off
the proxied traffic from node1 to node2, – e.g. standby the active node, so the cluster can move up the traffic to other available node.

[root@virt-mach-centos1 ~]# pcs cluster standby virt-mach-centos1

3. Stop VM virt-mach-centos1 & backup on Hypervisor host (hypervisor-host1) for VM1

Another prevention step to make sure you don't get into damaged VM or broken haproxy cluster after the upgrade is to of course backup

[root@hypervisor-host1 ]# prlctl backup virt-mach-centos1

[root@hypervisor-host1 ]# prlctl stop virt-mach-centos1
[root@hypervisor-host1 ]# cp -rpf /vz/vmprivate/dc37c201-08c9-489d-aa20-9386d63ce3f3 /vz/vmprivate/dc37c201-08c9-489d-aa20-9386d63ce3f3-bak
[root@hypervisor-host1 ]# tar -czvf virt-mach-centos1_vm_virt-mach-centos1.tar.gz /vz/vmprivate/dc37c201-08c9-489d-aa20-9386d63ce3f3

[root@hypervisor-host1 ]# prlctl start virt-mach-centos1

4. Remove package version locks on all hosts

If you're using package locking to prevent some other colleague to not accidently upgrade the machine (if multiple sysadmins are managing the host), you might use the RPM package locking meachanism, if that is used check RPM packs that are locked and release the locking.

+ List actual list of locked packages

[root@hypervisor-host1 ]# yum versionlock list
…
…..
0:libtalloc-2.1.16-1.el7.*
0:libedit-3.0-12.20121213cvs.el7.*
0:p11-kit-trust-0.23.5-3.el7.*
1:quota-nls-4.01-19.el7.*
0:perl-Exporter-5.68-3.el7.*
0:sudo-1.8.23-9.el7.*
0:libxslt-1.1.28-5.el7.*
versionlock list done

+ Clear the locking

# yum versionlock clear

+ List actual list / == clear all entries

[root@virt-mach-centos2 ]# yum versionlock list; yum versionlock clear
[root@virt-mach-centos1 ]# yum versionlock list; yum versionlock clear
[root@hypervisor-host1 ~]# yum versionlock list; yum versionlock clear
[root@hypervisor-host2 ~]# yum versionlock list; yum versionlock clear

5. Do yum update virt-mach-centos1

For some clarity if something goes wrong, it is really a good idea to make a dump of the basic packages installed before the RPM package update is initiated,
The exact versoin of RHEL or CentOS as well as the list of locked packages, if locking is used.

Enter virt-mach-centos1 (ssh virt-mach-centos1) and run following cmds:

# cat /etc/redhat-release > /root/logs/redhat-release-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
# cat /etc/grub.d/30_os-prober > /root/logs/grub2-efi-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out

+ Only if needed!!

# yum versionlock clear
# yum versionlock list

Clear any previous RPM packages – careful with that as you might want to keep the old RPMs, if unsure comment out below line

# yum clean all |tee /root/logs/yumcleanall-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out

Proceed with the update and monitor closely the output of commands and log out everything inside files using a small script that you should place under /root/status the script is given at the end of the aritcle.:

yum check-update |tee /root/logs/yumcheckupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
yum check-update | wc -l
yum update |tee /root/logs/yumupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
sh /root/status |tee /root/logs/status-before-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out

6. Check if everything is running fine after upgrade

Reboot VM

# shutdown -r now

7. Stop VM virt-mach-centos2 & backup on Hypervisor host (hypervisor-host2)

Same backup step as prior

# prlctl backup virt-mach-centos2

# prlctl stop virt-mach-centos2
# cp -rpf /vz/vmprivate/92075803-a4ce-4ec0-a3d8-9ee83d85fc76 /vz/vmprivate/92075803-a4ce-4ec0-a3d8-9ee83d85fc76-bak
## tar -czvf virt-mach-centos2_vm_virt-mach-centos2.tar.gz /vz/vmprivate/92075803-a4ce-4ec0-a3d8-9ee83d85fc76

# prctl start virt-mach-centos2

8. Do yum update on virt-mach-centos2

Log system state, before the update

# cat /etc/redhat-release > /root/logs/redhat-release-vorher-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
# cat /etc/grub.d/30_os-prober > /root/logs/grub2-efi-vorher-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
# yum versionlock clear == if needed!!
# yum versionlock list

+ Clean old install update / packages if required

# yum clean all |tee /root/logs/yumcleanall-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out

Initiate the update

# yum check-update |tee /root/logs/yumcheckupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out 2>&1
# yum check-update | wc -l
# yum update |tee /root/logs/yumupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out 2>&1
# sh /root/status |tee /root/logs/status-before-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out

9. Check if everything is running fine after upgrade

Reboot VM

# shutdown -r now

10. Stop VM vm-host2 & backup

# prlctl backup vm-host2

# prlctl stop vm-host2

Or copy the actual directory containig the Virtozzo VM (use the correct ID)

# cp -rpf /vz/vmprivate/76e8a5f8-caa8-5442-830e-aa4bfe8d42d9 /vz/vmprivate/76e8a5f8-caa8-5442-830e-aa4bfe8d42d9-bak
## tar -czvf vm-host2.tar.gz /vz/vmprivate/76e8a5f8-caa8-4442-830e-aa5bfe8d42d9

# prctl start vm-host2

11. Do yum update vm-host2

# cat /etc/redhat-release > /root/logs/redhat-release-vorher-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
# cat /etc/grub.d/30_os-prober > /root/logs/grub2-efi-vorher-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out

Clear only if needed

# yum versionlock clear
# yum versionlock list
# yum clean all |tee /root/logs/yumcleanall-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out

Do the rpm upgrade

# yum check-update |tee /root/logs/yumcheckupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
# yum check-update | wc -l
# yum update |tee /root/logs/yumupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
# sh /root/status |tee /root/logs/status-before-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out

12. Check if everything is running fine after upgrade

Reboot VM

# shutdown -r now

13. Do yum update hypervisor-host2

# cat /etc/redhat-release > /root/logs/redhat-release-vorher-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
# cat /etc/grub.d/30_os-prober > /root/logs/grub2-efi-vorher-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out

Clear lock if needed

# yum versionlock clear
# yum versionlock list
# yum clean all |tee /root/logs/yumcleanall-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out

Update rpms

# yum check-update |tee /root/logs/yumcheckupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out 2>&1
# yum check-update | wc -l
# yum update |tee /root/logs/yumupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out 2>&1
# sh /root/status |tee /root/logs/status-before-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out

14. Stop VM vm-host1 & backup

Some as ealier

# prlctl backup vm-host1

# prlctl stop vm-host1

# cp -rpf /vz/vmprivate/74a7bbe8-9245-4385-ac0d-d10299100789 /vz/vmprivate/74a7bbe8-9245-4385-ac0d-d10299100789-bak
# tar -czvf vm-host1.tar.gz /vz/vmprivate/74a7bbe8-9245-4385-ac0d-d10299100789

# prctl start vm-host1

15. Do yum update vm-host2

# cat /etc/redhat-release > /root/logs/redhat-release-vorher-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
# cat /etc/grub.d/30_os-prober > /root/logs/grub2-efi-vorher-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
# yum versionlock clear == if needed!!
# yum versionlock list
# yum clean all |tee /root/logs/yumcleanall-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
# yum check-update |tee /root/logs/yumcheckupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
# yum check-update | wc -l
# yum update |tee /root/logs/yumupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
# sh /root/status |tee /root/logs/status-before-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out

16. Check if everything is running fine after upgrade

+ Reboot VM

# shutdown -r now

17. Do yum update hypervisor-host1

Same procedure for HV host 1

# cat /etc/redhat-release > /root/logs/redhat-release-vorher-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
# cat /etc/grub.d/30_os-prober > /root/logs/grub2-efi-vorher-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out

Clear lock

# yum versionlock clear
# yum versionlock list
# yum clean all |tee /root/logs/yumcleanall-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out

# yum check-update |tee /root/logs/yumcheckupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
# yum check-update | wc -l
# yum update |tee /root/logs/yumupdate-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out
# sh /root/status |tee /root/logs/status-before-$(hostname)-$(date '+%Y-%m-%d_%H-%M-%S').out

18. Check if everything is running fine after upgrade

Reboot VM

# shutdown -r now

Check hypervisor-host1 all VMs run as expected

19. Check if everything is running fine after upgrade

Reboot VM

# shutdown -r now

Check hypervisor-host2 all VMs run as expected afterwards

20. Check once more VMs and haproxy or any other contained services in VMs run as expected

21. Haproxy Unstandby virt-mach-centos1

Assuming that the virt-mach-centos1 and virt-mach-centos2 are running a Haproxy / corosync cluster you can try to standby node1 and check the result
hopefully all should be fine and traffic should come to host node2.

[root@virt-mach-centos1 ~]# pcs cluster unstandby virt-mach-centos1

Monitor logs and make sure HAproxy works fine on virt-mach-centos1

22. If necessery to redefine VMs (in case they disappear from virsh) or virtuosso is not working

[root@virt-mach-centos1 ]# virsh define /root/virt-mach-centos1_config_bak.xml
[root@virt-mach-centos1 ]# virsh define /root/virt-mach-centos2_config_bak.xml

23. Set versionlock to RPMs to prevent accident updates and check OS version release

[root@virt-mach-centos2 ]# yum versionlock \*
[root@virt-mach-centos1 ]# yum versionlock \*
[root@hypervisor-host1 ~]# yum versionlock \*
[root@hypervisor-host2 ~]# yum versionlock \*

[root@hypervisor-host2 ~]# cat /etc/redhat-release
CentOS Linux release 7.8.2003 (Core)

Other useful hints

[root@hypervisor-host1 ~]# virsh console dc37c201-08c9-489d-aa20-9386d63ce3f3
Connected to domain virt-mach-centos1
..

! Compare packages count before the upgrade on each of the supposable identical VMs and HVs – if there is difference in package count review what kind of packages are different and try to make the machines to look as identical as possible !

Packages to update on hypervisor-host1 Count: XXX
Packages to update on hypervisor-host2 Count: XXX
Packages to update virt-mach-centos1 Count: – 254
Packages to update virt-mach-centos2 Count: – 249

The /root/status script

+++

#!/bin/sh
echo '======================================================= '
echo '= Systemctl list-unit-files –type=service | grep enabled '
echo '======================================================= '
systemctl list-unit-files –type=service | grep enabled

echo '======================================================= '
echo '= systemctl | grep ".service" | grep "running" '
echo '======================================================= '
systemctl | grep ".service" | grep "running"

echo '======================================================= '
echo '= chkconfig –list '
echo '======================================================= '
chkconfig –list

echo '======================================================= '
echo '= netstat -tulpn '
echo '======================================================= '
netstat -tulpn

echo '======================================================= '
echo '= netstat -r '
echo '======================================================= '
netstat -r

+++

That's all folks, once going through the article, after some 2 hours of efforts or so you should have an up2date machines.
Any problems faced or feedback is mostly welcome as this might help others who have the same setup.

Thanks for reading me 🙂

Tags: backups, check, cmds, Connected, downtime, echo, Linux Hypervisor Host, list, logs, repo, root root, rpm, Set, setup, updates, virtual machines, yum versionlock
Posted in CentOS, Linux, Linux Package Management, System Administration, Virtual Machines | 1 Comment »

List and fix failed systemd failed services after Linux OS upgrade and how to get full info about systemd service from jorunal log

Friday, February 25th, 2022

I have recently upgraded a number of machines from Debian 10 Buster to Debian 11 Bullseye. The update as always has some issues on some machines, such as problem with package dependencies, changing a number of external package repositories etc. to match che Bullseye deb packages. On some machines the update was less painful on others but the overall line was that most of the machines after the update ended up with one or more failed systemd services. It could be that some of the machines has already had this failed services present and I never checked them from the previous time update from Debian 9 -> Debian 10 or just some mess I've left behind in the hurry when doing software installation in the past. This doesn't matter anyways the fact was that I had to deal to a number of systemctl services which I managed to track by the Failed service mesage on system boot on one of the physical machines and on the OpenXen VTY Console the rest of Virtual Machines after update had some Failed messages. Thus I've spend some good amount of time like an overall of a day or two fixing strange failed services. This is how this small article was born in attempt to help sysadmins or any home Linux desktop users, who has updated his Debian Linux / Ubuntu or any other deb based distribution but due to the chaotic nature of Linux has ended with same strange Failed services and look for a way to find the source of the failures and get rid of the problems.
Systemd is a very complicated system and in my many sysadmin opinion it makes more problems than it solves, but okay for today's people's megalomania mindset it matches well.

1. Check the journal for errors, running service irregularities and so on

First thing to do to track for errors, right after the update is to take some minutes and closely check,, the journalctl for any strange errors, even on well maintained Unix machines, this journal log would bring you to a problem that is not fatal but still some process or stuff is malfunctioning in the background that you would like to solve:

root@pcfreak:~# journalctl -x
Jan 10 10:10:01 pcfreak CRON[17887]: pam_unix(cron:session): session closed for user root
Jan 10 10:10:01 pcfreak audit[17887]: USER_END pid=17887 uid=0 auid=0 ses=340858 subj==unconfined msg='op=PAM:session_close grantors=pam_loginuid,pam_env,pam_env,pam_permit>
Jan 10 10:10:01 pcfreak audit[17888]: CRED_DISP pid=17888 uid=0 auid=0 ses=340860 subj==unconfined msg='op=PAM:setcred grantors=pam_permit acct="root" exe="/usr/sbin/cron" >
Jan 10 10:10:01 pcfreak CRON[17888]: pam_unix(cron:session): session closed for user root
Jan 10 10:10:01 pcfreak audit[17888]: USER_END pid=17888 uid=0 auid=0 ses=340860 subj==unconfined msg='op=PAM:session_close grantors=pam_loginuid,pam_env,pam_env,pam_permit>
Jan 10 10:10:01 pcfreak audit[17884]: CRED_DISP pid=17884 uid=0 auid=0 ses=340855 subj==unconfined msg='op=PAM:setcred grantors=pam_permit acct="root" exe="/usr/sbin/cron" >
Jan 10 10:10:01 pcfreak CRON[17884]: pam_unix(cron:session): session closed for user root
Jan 10 10:10:01 pcfreak audit[17884]: USER_END pid=17884 uid=0 auid=0 ses=340855 subj==unconfined msg='op=PAM:session_close grantors=pam_loginuid,pam_env,pam_env,pam_permit>
Jan 10 10:10:01 pcfreak audit[17886]: CRED_DISP pid=17886 uid=0 auid=33 ses=340859 subj==unconfined msg='op=PAM:setcred grantors=pam_permit acct="www-data" exe="/usr/sbin/c>
Jan 10 10:10:01 pcfreak CRON[17886]: pam_unix(cron:session): session closed for user www-data
Jan 10 10:10:01 pcfreak audit[17886]: USER_END pid=17886 uid=0 auid=33 ses=340859 subj==unconfined msg='op=PAM:session_close grantors=pam_loginuid,pam_env,pam_env,pam_permi>
Jan 10 10:10:08 pcfreak NetworkManager[696]: [1641802208.0899] device (eth1): carrier: link connected
Jan 10 10:10:08 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Up – 100Mbps/Full – flow control rx/tx
Jan 10 10:10:08 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Down
Jan 10 10:10:19 pcfreak NetworkManager[696]: [1641802219.7920] device (eth1): carrier: link connected
Jan 10 10:10:19 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Up – 100Mbps/Full – flow control rx/tx
Jan 10 10:10:20 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Down
Jan 10 10:10:22 pcfreak NetworkManager[696]: [1641802222.2772] device (eth1): carrier: link connected
Jan 10 10:10:22 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Up – 100Mbps/Full – flow control rx/tx
Jan 10 10:10:23 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Down
Jan 10 10:10:33 pcfreak sshd[18142]: Unable to negotiate with 66.212.17.162 port 19255: no matching key exchange method found. Their offer: diffie-hellman-group14-sha1,diff>
Jan 10 10:10:41 pcfreak NetworkManager[696]: [1641802241.0186] device (eth1): carrier: link connected
Jan 10 10:10:41 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Up – 100Mbps/Full – flow control rx/tx

If you want to only check latest journal log messages use the -x -e (pager catalog) opts

root@pcfreak;~# journalctl -xe
…
Feb 25 13:08:29 pcfreak audit[2284920]: USER_LOGIN pid=2284920 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='op=login acct=28696E76616C>
Feb 25 13:08:29 pcfreak sshd[2284920]: Received disconnect from 177.87.57.145 port 40927:11: Bye Bye [preauth]
Feb 25 13:08:29 pcfreak sshd[2284920]: Disconnected from invalid user ubuntuuser 177.87.57.145 port 40927 [preauth]

Next thing to after the update was to get a list of failed service only.

2. List all systemd failed check services which was supposed to be running

root@pcfreak:/root # systemctl list-units | grep -i failed
● certbot.service loaded failed failed Certbot
● logrotate.service loaded failed failed Rotate log files
● maldet.service loaded failed failed LSB: Start/stop maldet in monitor mode
● named.service loaded failed failed BIND Domain Name Server

Alternative way is with the –failed option

hipo@jeremiah:~$ systemctl list-units –failed
UNIT LOAD ACTIVE SUB DESCRIPTION
● haproxy.service loaded failed failed HAProxy Load Balancer
● libvirt-guests.service loaded failed failed Suspend/Resume Running libvirt Guests
● libvirtd.service loaded failed failed Virtualization daemon
● nvidia-persistenced.service loaded failed failed NVIDIA Persistence Daemon
● sqwebmail.service masked failed failed sqwebmail.service
● tpm2-abrmd.service loaded failed failed TPM2 Access Broker and Resource Management Daemon
● wd_keepalive.service loaded failed failed LSB: Start watchdog keepalive daemon

LOAD = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB = The low-level unit activation state, values depend on unit type.
7 loaded units listed.

root@jeremiah:/etc/apt/sources.list.d# systemctl list-units –failed
UNIT LOAD ACTIVE SUB DESCRIPTION
● haproxy.service loaded failed failed HAProxy Load Balancer
● libvirt-guests.service loaded failed failed Suspend/Resume Running libvirt Guests
● libvirtd.service loaded failed failed Virtualization daemon
● nvidia-persistenced.service loaded failed failed NVIDIA Persistence Daemon
● sqwebmail.service masked failed failed sqwebmail.service
● tpm2-abrmd.service loaded failed failed TPM2 Access Broker and Resource Management Daemon
● wd_keepalive.service loaded failed failed LSB: Start watchdog keepalive daemon

To get a full list of objects of systemctl you can pass as state:

# systemctl –state=help
Full list of possible load states to pass is here
Show service properties

Check whether a service is failed or has other status and check default set systemd variables for it.

root@jeremiah~:# systemctl is-failed vboxweb.service
inactive

# systemctl show haproxy
Type=notify
Restart=always
NotifyAccess=main
RestartUSec=100ms
TimeoutStartUSec=1min 30s
TimeoutStopUSec=1min 30s
TimeoutAbortUSec=1min 30s
TimeoutStartFailureMode=terminate
TimeoutStopFailureMode=terminate
RuntimeMaxUSec=infinity
WatchdogUSec=0
WatchdogTimestampMonotonic=0
RootDirectoryStartOnly=no
RemainAfterExit=no
GuessMainPID=yes
SuccessExitStatus=143
MainPID=304858
ControlPID=0
FileDescriptorStoreMax=0
NFileDescriptorStore=0
StatusErrno=0
Result=success
ReloadResult=success
CleanResult=success
…

Full output of the above command is dumped in show_systemctl_properties.txt

3. List all running systemd services for a better overview on what's going on on machine

To get a list of all properly systemd loaded services you can use –state running.

hipo@jeremiah:~$ systemctl list-units –state running|head -n 10
UNIT LOAD ACTIVE SUB DESCRIPTION
proc-sys-fs-binfmt_misc.automount loaded active running Arbitrary Executable File Formats File System Automount Point
cups.path loaded active running CUPS Scheduler
init.scope loaded active running System and Service Manager
session-2.scope loaded active running Session 2 of user hipo
accounts-daemon.service loaded active running Accounts Service
anydesk.service loaded active running AnyDesk
apache-htcacheclean.service loaded active running Disk Cache Cleaning Daemon for Apache HTTP Server
apache2.service loaded active running The Apache HTTP Server
avahi-daemon.service loaded active running Avahi mDNS/DNS-SD Stack

It is useful thing is to list all unit-files configured in systemd and their state, you can do it with:

root@pcfreak:~# systemctl list-unit-files
UNIT FILE STATE VENDOR PRESET
proc-sys-fs-binfmt_misc.automount static –
-.mount generated –
backups.mount generated –
dev-hugepages.mount static –
dev-mqueue.mount static –
media-cdrom0.mount generated –
mnt-sda1.mount generated –
proc-fs-nfsd.mount static –
proc-sys-fs-binfmt_misc.mount disabled disabled
run-rpc_pipefs.mount static –
sys-fs-fuse-connections.mount static –
sys-kernel-config.mount static –
sys-kernel-debug.mount static –
sys-kernel-tracing.mount static –
var-www.mount generated –
acpid.path masked enabled
cups.path enabled enabled

root@pcfreak:~# systemctl list-units –type service –all
UNIT LOAD ACTIVE SUB DESCRIPTION
accounts-daemon.service loaded inactive dead Accounts Service
acct.service loaded active exited Kernel process accounting
● alsa-restore.service not-found inactive dead alsa-restore.service
● alsa-state.service not-found inactive dead alsa-state.service
apache2.service loaded active running The Apache HTTP Server
● apparmor.service not-found inactive dead apparmor.service
apt-daily-upgrade.service loaded inactive dead Daily apt upgrade and clean activities
apt-daily.service loaded inactive dead Daily apt download activities
atd.service loaded active running Deferred execution scheduler
auditd.service loaded active running Security Auditing Service
auth-rpcgss-module.service loaded inactive dead Kernel Module supporting RPCSEC_GSS
avahi-daemon.service loaded active running Avahi mDNS/DNS-SD Stack
certbot.service loaded inactive dead Certbot
clamav-daemon.service loaded active running Clam AntiVirus userspace daemon
clamav-freshclam.service loaded active running ClamAV virus database updater
..

4. Finding out more on why a systemd configured service has failed

Usually getting info about failed systemd service is done with systemctl status servicename.service
However, in case of troubles with service unable to start to get more info about why a service has failed with (-l) or (–full) options

root@pcfreak:~# systemctl -l status logrotate.service
● logrotate.service – Rotate log files
Loaded: loaded (/lib/systemd/system/logrotate.service; static)
Active: failed (Result: exit-code) since Fri 2022-02-25 00:00:06 EET; 13h ago
TriggeredBy: ● logrotate.timer
Docs: man:logrotate(8)
man:logrotate.conf(5)
Process: 2045320 ExecStart=/usr/sbin/logrotate /etc/logrotate.conf (code=exited, status=1/FAILURE)
Main PID: 2045320 (code=exited, status=1/FAILURE)
CPU: 2.479s

Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| WARNING: For now we will assume you meant to write /32
Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| ERROR: '0.0.0.0/0.0.0.0' needs to be replaced by the term 'all'.
Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| SECURITY NOTICE: Overriding config setting. Using 'all' instead.
Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| WARNING: (B) '::/0' is a subnetwork of (A) '::/0'
Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| WARNING: because of this '::/0' is ignored to keep splay tree searching predictable
Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| WARNING: You should probably remove '::/0' from the ACL named 'all'
Feb 25 00:00:06 pcfreak systemd[1]: logrotate.service: Main process exited, code=exited, status=1/FAILURE
Feb 25 00:00:06 pcfreak systemd[1]: logrotate.service: Failed with result 'exit-code'.
Feb 25 00:00:06 pcfreak systemd[1]: Failed to start Rotate log files.
Feb 25 00:00:06 pcfreak systemd[1]: logrotate.service: Consumed 2.479s CPU time.

systemctl -l however is providing only the last log from message a started / stopped or whatever status service has generated. Sometimes systemctl -l servicename.service is showing incomplete the splitted error message as there is a limitation of line numbers on the console, see below

root@pcfreak:~# systemctl status -l certbot.service
● certbot.service – Certbot
Loaded: loaded (/lib/systemd/system/certbot.service; static)
Active: failed (Result: exit-code) since Fri 2022-02-25 09:28:33 EET; 4h 0min ago
TriggeredBy: ● certbot.timer
Docs: file:///usr/share/doc/python-certbot-doc/html/index.html
https://certbot.eff.org/docs
Process: 290017 ExecStart=/usr/bin/certbot -q renew (code=exited, status=1/FAILURE)
Main PID: 290017 (code=exited, status=1/FAILURE)
CPU: 9.771s

Feb 25 09:28:33 pcfrxen certbot[290017]: The error was: PluginError('An authentication script must be provided with –manual-auth-hook when using th>
Feb 25 09:28:33 pcfrxen certbot[290017]: All renewals failed. The following certificates could not be renewed:
Feb 25 09:28:33 pcfrxen certbot[290017]: /etc/letsencrypt/live/mail.pcfreak.org-0003/fullchain.pem (failure)
Feb 25 09:28:33 pcfrxen certbot[290017]: /etc/letsencrypt/live/www.eforia.bg-0005/fullchain.pem (failure)
Feb 25 09:28:33 pcfrxen certbot[290017]: /etc/letsencrypt/live/zabbix.pc-freak.net/fullchain.pem (failure)
Feb 25 09:28:33 pcfrxen certbot[290017]: 3 renew failure(s), 5 parse failure(s)
Feb 25 09:28:33 pcfrxen systemd[1]: certbot.service: Main process exited, code=exited, status=1/FAILURE
Feb 25 09:28:33 pcfrxen systemd[1]: certbot.service: Failed with result 'exit-code'.
Feb 25 09:28:33 pcfrxen systemd[1]: Failed to start Certbot.
Feb 25 09:28:33 pcfrxen systemd[1]: certbot.service: Consumed 9.771s CPU time.

5. Get a complete log of journal to make sure everything configured on server host runs as it should

Thus to get more complete list of the message and be able to later google and look if has come with a solution on the internet use:

root@pcfrxen:~# journalctl –catalog –unit=certbot

— Journal begins at Sat 2022-01-22 21:14:05 EET, ends at Fri 2022-02-25 13:32:01 EET. —
Jan 23 09:58:18 pcfrxen systemd[1]: Starting Certbot…
░░ Subject: A start job for unit certbot.service has begun execution
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░
░░ A start job for unit certbot.service has begun execution.
░░
░░ The job identifier is 5754.
Jan 23 09:58:20 pcfrxen certbot[124996]: Traceback (most recent call last):
Jan 23 09:58:20 pcfrxen certbot[124996]: File "/usr/lib/python3/dist-packages/certbot/_internal/renewal.py", line 71, in _reconstitute
Jan 23 09:58:20 pcfrxen certbot[124996]: renewal_candidate = storage.RenewableCert(full_path, config)
Jan 23 09:58:20 pcfrxen certbot[124996]: File "/usr/lib/python3/dist-packages/certbot/_internal/storage.py", line 471, in __init__
Jan 23 09:58:20 pcfrxen certbot[124996]: self._check_symlinks()
Jan 23 09:58:20 pcfrxen certbot[124996]: File "/usr/lib/python3/dist-packages/certbot/_internal/storage.py", line 537, in _check_symlinks

root@server:~# journalctl –catalog –unit=certbot|grep -i pluginerror|tail -1
Feb 25 09:28:33 pcfrxen certbot[290017]: The error was: PluginError('An authentication script must be provided with –manual-auth-hook when using the manual plugin non-interactively.')

Or if you want to list and read only the last messages in the journal log regarding a service

root@server:~# journalctl –catalog –pager-end –unit=certbot
…

If you have disabled a failed service because you don't need it to run at all on the machine with:

root@rhel:~# systemctl stop rngd.service
root@rhel:~# systemctl disable rngd.service

And you want to clear up any failed service information that is kept in the systemctl service log you can do it with:

root@rhel:~# systemctl reset-failed

Another useful systemctl option is cat, you can use it to easily list a service it is useful to quickly check what is a service, an actual shortcut to save you from giving a full path to the service e.g. cat /lib/systemd/system/certbot.service

root@server:~# systemctl cat certbot
# /lib/systemd/system/certbot.service
[Unit]
Description=Certbot
Documentation=file:///usr/share/doc/python-certbot-doc/html/index.html
Documentation=https://certbot.eff.org/docs
[Service]
Type=oneshot
ExecStart=/usr/bin/certbot -q renew
PrivateTmp=true

After failed SystemD services are fixed, it is best to reboot the machine and check put some more time to inspect rawly the complete journal log to make sure, no error was left behind.

Closure

As you can see updating a machine from a major to a major version even if you follow the official documentation and you have plenty of experience is always more or a less a pain in the ass, which can eat up much of your time banging your head solving problems with failed daemons issues with /etc/rc.local (which I have faced becase of #/bin/sh -e (which would make /etc/rc.local) to immediately quit if any error from command $? returns different from 0 etc.. The logical questions comes then;
1. Is it really worthy to update at all regularly, especially if you don't know of a famous major Vulnerability 🙂 ?
2. Or is it worthy to update from OS major release to OS major release at all?
3. Or should you only try to patch the service that is exposed to an external reachable computer network or the internet only and still the the same OS release until End of Life (LTS = Long Term Support) as called in Debian or End Of Life (EOL) Cycle as called in RPM based distros the period until the OS major release your software distro has official security patches is reached.

Anyone could take any approach but for my own managed systems small network at home my practice was always to try to keep up2date everything every 3 or 6 months maximum. This has caused me multiple days of irritation and stress and perhaps many white hairs and spend nerves on shit.

4. Based on the company where I'm employed the better strategy is to patch to the EOL is still offered and keep the rule First Things First (FTF), once the EOL is reached, just make a copy of all servers data and configuration to external Data storage, bring up a new Physical or VM and migrate the services.
Test after the migration all works as expected if all is as it should be change the DNS records or Leading Infrastructure Proxies whatever to point to the new service and that's it! Yes it is true that migration based on a full OS reinstall is more time consuming and requires much more planning, but usually the result is much more expected, plus it is much less stressful for the guy doing the job.

Tags: active, check, daemon, daemons, fix, fixing, haproxy, hipo, Journal, journalctl, kernel, libvirtd.service, list, log, log messages, lsb, Main, msg, PAM, pcfreak, properly, rc.local, root, root server, strange errors, Suspend Resume Running, systemctl, systemd, uid
Posted in Linux, Migration, OS Update, System Administration | No Comments »

List all existing local admin users belonging to admin group and mail them to monitoring mail box

Monday, February 8th, 2021

local-user-account-creation-deletion-change-monitor-accounts-and-send-them-to-central-monitoring-mail

If you have a bunch of servers that needs to have a tight security with multiple Local users superuser accounts that change over time and you need to frequently keep an have a long over time if some new system UNIX local users in /etc/passwd /etc/group has been added deleted e.g. the /etc/passwd /etc/group then you might have the task to have some primitive monitoring set and the most primitive I can think of is simply routinely log users list for historical purposes to a common mailbox over time (lets say 4 times a month or even more frequently) you might send with a simple cron job a list of all existing admin authorized users to a logging sysadmin mailbox like lets say:

Local-unix-users@yourcompanydomain.com

A remark to make here is the common sysadmin practice scenario to have local existing non-ldap admin users group members of whom are authorized to use sudo su – root via /etc/sudoers is described in my previous article how to add local users to admin group superuser access via sudo I thus have been managing already a number of servers that have user setup using the above explained admin group.

Thus to have the monitoring at place I've developed a tiny shell script that does check all users belonging to the predefined user group dumps it to .csv format that starts with a simple timestamp on when user admin list was made and sends it to a predefined email address as well as logs sent mail content for further reference in a local directory.

The task is a relatively easy but since nowadays the level of competency of system administration across youngsters is declinging -that's of course in my humble opinion (just like it happens in every other profession), below is the developed list-admin-users.sh

#!/bin/bash
# dump all users belonging to a predefined admin user / group in csv format
# with a day / month year timestamp and mail it to a predefined admin
# monitoring address
TO_ADDRESS="Local-unix-users@yourcompanydomain.com";
HOSTN=$(hostname);
# root@server:/# grep -i 1000 /etc/passwd
# username:x:username:1000:username,,,:/home/username:/bin/bash
# username1:x:username1:1000:username1,,,:/home/username1:/bin/bash
# username5:x:username1:1000:username5,,,:/home/username5:/bin/bash

ADMINS_ID='4355';
#
# root@server # group_id_ID='4355'; grep -i group_id_ID /etc/passwd
# …
# username1:x:1005:4355:username1,,,:/home/username1:/bin/bash
# username5:x:1005:4355,,,:/home/username5:/bin/bash

group_id_ID='215';
group_id='group_id';
FIL="/var/log/userlist-log-dir/userlist_$(date +"%d_%m_%Y_%H_%M").txt";
CUR_D="$HOSTN: Current admin users $(date)"; >> $FIL; echo -e "##### $CUR_D #####" >> $FIL;
for i in $(cat /etc/passwd | grep -i /home|grep /bin/bash|grep -e "$ADMINS_ID" -e "$group_id_ID" | cut -d : -f1); do \
if [[ $(grep $i /etc/group|grep $group_id) ]]; then
f=$(echo $i); echo $i,group_id,$(id -g $i); else echo $i,admin,$(id -g $i);
fi
done >> $FIL; mail -s "$CUR_D" $TO_ADDRESS < $FIL

list-admin-users.sh is ready for download also here

To make the script report you will have to place it somewhere for example in /usr/local/bin/list-admin-users.sh , create its log dir location /var/log/userlist-log-dir/ and set proper executable and user/group script and directory permissions to it to be only readable for root user.

root@server: # mkdir /var/log/userlist-log-dir/
root@server: # chmod +x /usr/local/bin/list-admin-users.sh
root@server: # chmod -R 700 /var/log/userlist-log-dir/

To make the script generate its admin user reports and send it to the central mailbox a couple of times in the month early in the morning (assuming you have a properly running postfix / qmail / sendmail … smtp), as a last step you need to set a cron job to routinely invoke the script as root user.

root@server: # crontab -u root -e
12 06 5,10,15,20,25,1 /usr/local/bin/list-admin-users.sh

That's all folks now on 5th 10th, 15th, 20th 25th and 1st at 06:12 you'll get the admin user list reports done. Enjoy 🙂

Tags: admin, Group, list, list-admin-users.shsudoers, list-user-admins.sh, mail, report, script, sudo
Posted in Linux, System Administration, Various | No Comments »

How to check Microsoft IIS webserver version

Monday, July 21st, 2014

If you have to tune some weirdly behaviour Microsoft IIS (Internet Information Services) webserver, the first thing to do is to collect information about the system you're dealing with – get version of installed Windows and check what kind of IIS version is running on the Windows server?

To get the version of installed Windows on the system you just logged in, the quickest way I use is:

Start -> My Computer (right mouse button) Properties

check-windows-server-version-screenshot-windows-2003-r2

Run regedit from cmd.exe and go and check value of registry value:

HKEY_LOCAL_MACHINE\SOFTWARE\MicrosoftInetStp\VersionString

check-iis-webserver-version-with-windows-registry-screenshot

As you can see in screenshot in this particular case it is IIS version 6.0.

An alternative way to check the IIS version in some cases (if IIS version return is not disabled) is to telnet to webserver:

telnet your-webserver 80

Once connected Send:

HEAD / HTTP/1.0

Also on some Windows versions it is possible to check IIS webserver version from Internet Information Services Management Cosnole:

To check IIS version from IIS Manager:

Start (button) -> Control Panel -> Administrative Tools -> "Internet Information Services" IIS Manager

From IIS Manager go to:

Help -> About Microsoft Management Console

Here is a list with most common IIS version output you will get depending on the version of Windows server:

Windows NT 3.51 1.0
Windows NT 4 2.0-4.0
Windows Server 2000 5.0
Windows XP Professional 5.1
Windows Server 2003 6.0
Windows Vista 7.0
Windows Server 2008 7.0
Windows Server 2008 R2 7.5
Windows 7 7.5
Windows Server 2012 8.0
Windows 8 8.0
Windows Server 2012 R2 8.5
Windows 8.1 8.5

If you have only an upload FTP access to a Folder served by IIS Webserver – i.e. no access to the Win server running IIS, you can also grasp the IIS version with following .ASP code:

<%
response.write(Request.ServerVariables("SERVER_SOFTWARE"))
%>

Save the file as anyfile.asp somewhere in IIS docroot and invoke it in browser.

Tags: case, check, How to, IIS, Internet Information Services Management Cosnole, kind, list, mouse, response, running, system, thing, version, Windows, windows server, Windows Vista
Posted in Everyday Life, IIS, Various, Web and CMS, Windows | 2 Comments »

Reinstall all Debian packages with a copy of apt deb package list from another working Debian Linux installation

Wednesday, July 29th, 2020

Reinstall-all-Debian-packages-with-copy-of-apt-packages-list-from-another-working-Debian-Linux-installation

Few days ago, in the hurry in the small hours of the night, I've done something extremely stupid. Wanting to move out a .tar.gz binary copy of qmail installation to /var/lib/qmail with all the dependent qmail items instead of extracting to admin user /root directory (/root), I've extracted it to the main Operating system root / directrory.
Not noticing this, I've quickly executed rm -rf var with the idea to delete all directory tree under /root/var just 3 seconds later, I've realized I'm issuing the rm -rf var with the wrong location WITH a root user !!!! Being scared on what I've done, I've quickly pressed CTRL+C to immedately cancel the deletion operation of my /var.

But as you can guess, since the machine has an Slid State Drive drive and SSD memory drive are much more faster in I/O operations than the classical ATA / SATA disks. I was not quick enough to cancel the operation and I've noticed already some part of my /var have been R.I.P-pped in the heaven of directories.

This was ofcourse upsetting so for a while I rethinked the situation to get some ideas on what I can do to recover my system ASAP!!! and I had the idea of course to try to reinstall All my installed .deb debian packages to restore system closest to the normal, before my stupid mistake.

Guess my unpleasent suprise when I have realized dpkg and respectively apt-get apt and aptitude package management tools cannot anymore handle packages as Debian Linux's package dependency database has been damaged due to missing dpkg directory

/var/lib/dpkg

Oh man that was unpleasent, especially since I've installed plenty of stuff that is custom on my Mate based desktop and, generally reinstalling it updating the sytem to the latest Debian security updates etc. will be time consuming and painful process I wanted to omit.

So of course the logical thing to do here was to try to somehow recover somehow a database copy of /var/lib/dpkg if that was possible, that of course led me to the idea to lookup for a way to recover my /var/lib/dpkg from backup but since I did not maintained any backup copy of my OS anywhere that was not really possible, so anyways I wondered whether dpkg does not keep some kind of database backups somewhere in case if something goes wrong with its database.
This led me to this nice Ubuntu thred which has pointed me to the part of my root rm -rf dpkg db disaster recovery solution.
Luckily .deb package management creators has thought about situation similar to mine and to give the user a restore point for /var/lib/dpkg damaged database

/var/lib/dpkg is periodically backed up in /var/backups

A typical /var/lib/dpkg on Ubuntu and Debian Linux looks like so:

hipo@jeremiah:/var/backups$ ls -l /var/lib/dpkg
total 12572
drwxr-xr-x 2 root root 4096 Jul 26 03:22 alternatives
-rw-r–r– 1 root root 11 Oct 14 2017 arch
-rw-r–r– 1 root root 2199402 Jul 25 20:04 available
-rw-r–r– 1 root root 2199402 Oct 19 2017 available-old
-rw-r–r– 1 root root 8 Sep 6 2012 cmethopt
-rw-r–r– 1 root root 1337 Jul 26 01:39 diversions
-rw-r–r– 1 root root 1223 Jul 26 01:39 diversions-old
drwxr-xr-x 2 root root 679936 Jul 28 14:17 info
-rw-r—– 1 root root 0 Jul 28 14:17 lock
-rw-r—– 1 root root 0 Jul 26 03:00 lock-frontend
drwxr-xr-x 2 root root 4096 Sep 17 2012 parts
-rw-r–r– 1 root root 1011 Jul 25 23:59 statoverride
-rw-r–r– 1 root root 965 Jul 25 23:59 statoverride-old
-rw-r–r– 1 root root 3873710 Jul 28 14:17 status
-rw-r–r– 1 root root 3873712 Jul 28 14:17 status-old
drwxr-xr-x 2 root root 4096 Jul 26 03:22 triggers
drwxr-xr-x 2 root root 4096 Jul 28 14:17 updates

Before proceeding with this radical stuff to move out /var/lib/dpkg/info from another machine to /var mistakenyl removed oned. I have tried to recover with the well known:

extundelete
foremost
recover
ext4magic
ext3grep
gddrescue
ddrescue
myrescue
testdisk
photorec

Linux file deletion recovery tools from a USB stick loaded with a Number of LiveCD distributions, i.e. tested recovery with:

Debian LiveCD
Ubuntu LiveCD
KNOPPIX
SystemRescueCD
Trinity Rescue Kit
Ultimate Boot CD

but unfortunately none of them couldn't recover the deleted files …

The reason why the standard file recovery tools could not recover ?

My assumptions is after I've done by rm -rf var; from sysroot, issued the sync (- if you haven't used it check out man sync) command – that synchronizes cached writes to persistent storage and did a restart from the poweroff PC button, this should have worked, as I've recovered like that in the past) in a normal Sys V System with a normal old fashioned blocks filesystem as EXT2 . or any other of the filesystems without a journal, however as the machine run a EXT4 filesystem with a journald and journald, this did not work perhaps because something was not updated properly in /lib/systemd/systemd-journal, that led to the situation all recently deleted files were totally unrecoverable.

1. First step was to restore the directory skele of /var/lib/dpkg

# mkdir -p /var/lib/dpkg/{alternatives,info,parts,triggers,updates}

2. Recover missing /var/lib/dpkg/status file

The main file that gives information to dpkg of the existing packages and their statuses on a Debian based systems is /var/lib/dpkg/status

# cp /var/backups/dpkg.status.0 /var/lib/dpkg/status

3. Reinstall dpkg package manager to make package management working again

Say a warm prayer to the Merciful God ! and do:

# apt-get download dpkg
# dpkg -i dpkg*.deb

4. Reinstall base-files .deb package which provides basis of a Debian system

Hopefully everything will be okay and your dpkg / apt pair will be in normal working state, next step is to:

# apt-get download base-files
# dpkg -i base-files*.deb

5. Do a package sanity and consistency check and try to update OS package list

Check whether packages have been installed only partially on your system or that have missing, wrong or obsolete control data or files. dpkg should suggest what to do with them to get them fixed.

# dpkg –audit

Then resynchronize (fetch) the package index files from their sources described in /etc/apt/sources.list

# apt-get update

Do apt db constistency check:

# apt-get check

check is a diagnostic tool; it updates the package cache and checks for broken dependencies.

Take a deep breath ! …

Do :

ls -l /var/lib/dpkg
and compare with the above list. If some -old file is not present don't worry it will be there tomorrow.

Next time don't forget to do a regular backup with simple rsync backup script or something like Bacula / Amanda / Time Vault or Clonezilla.

6. Copy dpkg database from another Linux system that has a working dpkg / apt Database

Well this was however not the end of the story … There were still many things missing from my /var/ and luckily I had another Debian 10 Buster install on another properly working machine with a similar set of .deb packages installed. Therefore to make most of my programs still working again I have copied over /var from the other similar set of package installed machine to my messed up machine with the missing deleted /var.

To do so …
On Functioning Debian 10 Machine (Working Host in a local network with IP 192.168.0.50), I've archived content of /var:

linux:~# tar -czvf var_backup_debian10.tar.gz /var
…

Then sftped from Working Host towards the /var deleted broken one in my case this machine's hostname is jericho and luckily still had SSHD and SFTP running processes loaded in memory:

jericho:~# sftp root@192.168.0.50
sftp> get var_backup_debian10.tar.gz
…

Now Before extracting the archive it is a good idea to make backup of old /var remains somewhere for example somewhere in /root
just in case if we need to have a copy of the dpkg backup dir /var/backups

jericho:~# cp -rpfv /var /root/var_backup_damaged
…

jericho:~# tar -zxvf /root/var_backup_debian10.tar.gz
jericho:/# mv /root/var/ /
…

Then to make my /var/lib/dpkg contain the list of packages from my my broken Linux install I have ovewritten /var/lib/dpkg with the files earlier backupped before .tar.gz was extracted.

jericho:~# cp -rpfv /var /root/var_backup_damaged/lib/dpkg/ /var/lib/

7. Reinstall All Debian Packages completely scripts

I then tried to reinstall each and every package first using aptitude with aptitude this is done with

# aptitude reinstall '~i'

However as this failed, tried using a simple shell loop like below:

for i in $(dpkg -l |awk '{ print $2 }'); do echo apt-get install –reinstall –yes $i; done

Alternatively, all .deb package reninstall is also possible with dpkg –get-selections and with awk with below cmds:

dpkg –get-selections | grep -v deinstall | awk '{print $1}' > list.log;
awk '$1=$1' ORS=' ' list.log > newlist.log;
apt-get install –reinstall $(cat newlist.log)

It can also be run as one liner for simplicity:

dpkg –get-selections | grep -v deinstall | awk '{print $1}' > list.log; awk '$1=$1' ORS=' ' list.log > newlist.log; apt-get install –reinstall $(cat newlist.log)

This produced a lot of warning messages, reporting "package has no files currently installed" (virtually for all installed packages), indicating a severe packages problem below is sample output produced after each and every package reinstall … :

dpkg: warning: files list file for package 'iproute' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'brscan-skey' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'libapache2-mod-php7.4' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'libexpat1:amd64' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'libexpat1:i386' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'php5.6-readline' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'linux-headers-4.19.0-5-amd64' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'libgraphite2-3:amd64' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'libgraphite2-3:i386' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'libbonoboui2-0:amd64' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'libxcb-dri3-0:amd64' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'libxcb-dri3-0:i386' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'liblcms2-2:amd64' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'liblcms2-2:i386' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'libpixman-1-0:amd64' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'libpixman-1-0:i386' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'gksu' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'liblogging-stdlog0:amd64' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'mesa-vdpau-drivers:amd64' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'mesa-vdpau-drivers:i386' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'libzvbi0:amd64' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'libzvbi0:i386' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'libcdparanoia0:amd64' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'libcdparanoia0:i386' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'python-gconf' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'php5.6-cli' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'libpaper1:amd64' missing; assuming package has no files currently installed
dpkg: warning: files list file for package 'mixer.app' missing; assuming package has no files currently installed
…

After some attempts I found a way to be able to work around the warning message, for each package by simply reinstalling the package reporting the issue with

apt –reinstall $package_name

Though reinstallation started well and many packages got reinstalled, unfortunately some packages such as apache2-mod-php5.6 and other php related ones started failing during reinstall ending up in unfixable states right after installation of binaries from packages was successfully placed in its expected locations on disk. The failures occured during the package setup stage ( dpkg –configure $packagename) …

The logical thing to do is a recovery attempt with something like the usual well known by any Debian admin:

apt-get install –fix-missing

As well as Manual requesting to reconfigure (issue re-setup) of all installed packages also did not produced a positive result

dpkg –configure -a
…

But many packages were still failing due to dpkg inability to execute some post installation scripts from respective .deb files.
To work around that and continue installing the rest of packages I had to manually delete all files related to the failing package located under directory

/var/lib/dpkg/info#

For example to omit the post installation failure of libapache2-mod-php5.6 and have a succesful install of the package next time I tried reinstall, I had to delete all /var/lib/dpkg/info/libapache2-mod-php5.6.postrm, /var/lib/dpkg/info/libapache2-mod-php5.6.postinst scripts and even sometimes everything like libapache2-mod-php5.6* that were present in /var/lib/dpkg/info dir.

The problem with this solution, however was the package reporting to install properly, but the post install script hooks were still not in placed and important things as setting permissions of binaries after install or applying some configuration changes right after install was missing leading to programs failing to fully behave properly or even breaking up even though showing as finely installed …

The final solution to this problem was radical.
I've used /var/lib/dpkg database (directory) from ther other working Linux machine with dpkg DB OK found in var_backup_debian10.tar.gz (linux:~# host with a working dpkg database) and then based on the dpkg package list correct database responding on jericho:~# to reinstall each and every package on the system using Debian System Reinstaller script taken from the internet.
Debian System Reinstaller works but to reinstall many packages, I've been prompted again and again whether to overwrite configuration or keep the present one of packages.
To Omit the annoying [Y / N ] text prompts I had made a slight modification to the script so it finally looked like this:

#!/bin/bash
# Debian System Reinstaller
# Copyright (C) 2015 Albert Huang
# Copyright (C) 2018 Andreas Fendt
#
# This program is free software: you can redistribute it and/or modify
# it under the terms of the GNU General Public License as published by
# the Free Software Foundation, either version 3 of the License, or
# (at your option) any later version.
#
# This program is distributed in the hope that it will be useful,
# but WITHOUT ANY WARRANTY; without even the implied warranty of
# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
# GNU General Public License for more details.
#
# You should have received a copy of the GNU General Public License
# along with this program. If not, see <http://www.gnu.org/licenses/>.
#
# —
# This script assumes you are using a Debian based system
# (Debian, Mint, Ubuntu, #!), and have sudo installed. If you don't
# have sudo installed, replace "sudo" with "su -c" instead.
#

pkgs=`dpkg –get-selections | grep -w 'install$' | cut -f 1 | egrep -v '(dpkg|apt)'`

for pkg in $pkgs; do
echo -e "\033[1m * Reinstalling:\033[0m $pkg"

apt-get –reinstall -o Dpkg::Options::="–force-confdef" -o Dpkg::Options::="–force-confold" -y install $pkg || {
echo "ERROR: Reinstallation failed. See reinstall.log for details."
exit 1
}
done

debian-all-packages-reinstall.sh working modified version of Albert Huang and Andreas Fendt script can be also downloaded here.

Note ! Omitting the text confirmation prompts to install newest config or keep maintainer configuration is handled by the argument:

-o Dpkg::Options::="–force-confold

I however still got few NCurses Console selection prompts during the reinstall of about 3200+ .deb packages, so even with this mod the reinstall was not completely automatic.

Note ! During the reinstall few of the packages from the list failed due to being some old unsupported packages this was ejabberd, ircd-hybrid and a 2 / 3 more.
This failure was easily solved by completely purging those packages with the usual

# dpkg –purge $packagename

and reruninng debian-all-packages-reinstall.sh on each of the failing packages.

Note ! The failing packages were just old ones left over from Debian 8 and Debian 9 before the apt-get dist-upgrade towards 10 Duster.
Eventually I got a success by God's grance, after few hours of pains and trials, ending up in a working state package database and a complete set of freshly reinstalled packages.

The only thing I had to do finally is 2 hours of tampering why GNOME did not automatically booted after the system reboot due to failing gdm
until I fixed that I've temprary used ligthdm (x-display-manager), to do I've

dpkg –reconfigure gdm3

lightdm-x-display-manager-screenshot-gdm3-reconfige

to work around this I had to also reinstall few libraries, reinstall the xorg-server, reinstall gdm and reinstall the meta package for GNOME, using below set of commands:

apt-get install –reinstall libglw1-mesa libglx-mesa0
apt-get install –reinstall libglu1-mesa-dev
apt install –reinstallgsettings-desktop-schemas
apt-get install –reinstall xserver-xorg-video-intel
apt-get install –reinstall xserver-xorg
apt-get install –reinstall xserver-xorg-core
apt-get install –reinstall task-desktop
apt-get install –reinstall task-gnome-desktop

As some packages did not ended re-instaled on system because on the original host from where /var/lib/dpkg db was copied did not have it I had to eventually manually trigger reinstall for those too:

apt-get install –reinstall –yes vlc
apt-get install –reinstall –yes thunderbird
apt-get install –reinstall –yes audacity
apt-get install –reinstall –yes gajim
apt-get install –reinstall –yes slack remmina
apt-get install –yes k3b
pt-get install –yes gbgoffice
pt-get install –reinstall –yes skypeforlinux
apt-get install –reinstall –yes vlc
apt-get install –reinstall –yes libcurl3-gnutls libcurl3-nss
apt-get install –yes virtualbox-5.2
apt-get install –reinstall –yes vlc
apt-get install –reinstall –yes alsa-tools-gui
apt-get install –reinstall –yes gftp
apt install ./teamviewer_15.3.2682_amd64.deb –yes

Note that some of above packages requires a properly configured third party repositories, other people might have other packages that are missing from the dpkg list and needs to be reinstalled so just decide according to your own case of left aside working system present binaries that doesn't belong to any dpkg installed package.

After a bit of struggle everything is back to normal Thanks God! 🙂 !

Tags: apt-get, argument, backup copy, copy, course, database backups, deb package, debian linux, dpkg, extundelete, file, idea, information, installed, lib, libraries, list, Machine Working Host, package database, possible, Recover, recover deleted files, recovery, restore, root directory, root root, simplicity, time consuming, undelete, var, without
Posted in Linux, Linux Package Management, System Administration | 1 Comment »

Posts Tagged ‘list’

1.Count ESTABLISHED TCP connections from Windows Command Line

2. Few Useful netstat options for the Windows system admin

3. Use a Third Party GUI tool to debug more interactively connection issues

CurrPorts Tool own Description

Sum it up

2. Update the known repos list to include it

3. Install the btop deb package from backports

0. Check situation before the update on HVs / get VM IDs etc.

1. Dump VM definition XMs (to have it in case if it gets wiped during update)

2. Set on standby virt-mach-centos1 (virt-mach-centos1)

3. Stop VM virt-mach-centos1 & backup on Hypervisor host (hypervisor-host1) for VM1

4. Remove package version locks on all hosts

5. Do yum update virt-mach-centos1

6. Check if everything is running fine after upgrade

7. Stop VM virt-mach-centos2 & backup on Hypervisor host (hypervisor-host2)

8. Do yum update on virt-mach-centos2

9. Check if everything is running fine after upgrade

10. Stop VM vm-host2 & backup

11. Do yum update vm-host2

12. Check if everything is running fine after upgrade

13. Do yum update hypervisor-host2

14. Stop VM vm-host1 & backup

15. Do yum update vm-host2

16. Check if everything is running fine after upgrade

17. Do yum update hypervisor-host1

18. Check if everything is running fine after upgrade

19. Check if everything is running fine after upgrade

20. Check once more VMs and haproxy or any other contained services in VMs run as expected

21. Haproxy Unstandby virt-mach-centos1

22. If necessery to redefine VMs (in case they disappear from virsh) or virtuosso is not working

23. Set versionlock to RPMs to prevent accident updates and check OS version release

Other useful hints

1. Check the journal for errors, running service irregularities and so on

2. List all systemd failed check services which was supposed to be running

3. List all running systemd services for a better overview on what's going on on machine

4. Finding out more on why a systemd configured service has failed

5. Get a complete log of journal to make sure everything configured on server host runs as it should

Closure

Start -> My Computer (right mouse button) Properties

HKEY_LOCAL_MACHINE\SOFTWARE\MicrosoftInetStp\VersionString

Start (button) -> Control Panel -> Administrative Tools -> "Internet Information Services" IIS Manager

Help -> About Microsoft Management Console

1. First step was to restore the directory skele of /var/lib/dpkg

2. Recover missing /var/lib/dpkg/status file

3. Reinstall dpkg package manager to make package management working again

4. Reinstall base-files .deb package which provides basis of a Debian system

5. Do a package sanity and consistency check and try to update OS package list

6. Copy dpkg database from another Linux system that has a working dpkg / apt Database

7. Reinstall All Debian Packages completely scripts

Daily Bible quote

GET ARTICLE UPDATES

Useful blog? Help it:

Links to Other Places

Recent Posts

Ads

Categories

About Myself

Recent Comments

Tags

Top Post Views

blogtopsites