Archive for the ‘Remote System Administration’ Category

Must have software on freshly installed windows – Essential Software after fresh Windows install

Friday, March 18th, 2016


If you work in the IT industry, then even if you don't like installing Windows frequently, or you're a dedicated Linux / BSD user, you will certainly have a lot of friends who want your help to re-install or fix their Windows 7 / 8 / 10 OS. At least this is the case with me: every year I'm kind of obliged to install a fresh Windows on newly bought notebooks / desktop PCs of friends or relatives.

Of course, the preferred set of necessary software varies depending on who the new Windows OS is installed for; however, there is more or less a standard list of Windows software that is used daily by most average computer users, such as:
 

Not to forget, good candidates to install on a fresh Windows installation are:

  • WinRAR
  • PeaZip
  • WinZip
  • Greenshot (to easily take screenshots and save pictures locally or to the cloud)
  • AnyDesk (non-free but very functional alternative to TeamViewer) to remotely access a PC
  • TightVNC
  • iTunes / Spotify (for people who also have an iPhone smartphone)
  • Dropbox or pCloud (to have some extra free cloud space)
  • FBReader (for those reading a lot of books in different formats)
  • Rufus – an efficient and lightweight tool to create bootable USB drives. It helps you create BIOS or UEFI bootable devices and Windows To Go drives, and it supports various disk, format and partition types.
  • Recuva – data recovery software for Windows 10 (non-free)
  • EaseUS (for specific backup / restore purposes, but unfortunately non-free)
  • For designers:
      • Adobe Photoshop
      • Adobe Illustrator
  • f.lux – to adjust the color temperature of the screen and potentially save your eyes
  • ImDisk Virtual Disk Driver
  • KeePass / PasswordSafe – to securely store your passwords
  • Putty / MobaXterm / SecureCRT / mPutty (for system administrators and programmers who have to deal with Linux / UNIX)

These are the programs I tend to install on new Windows installs, and thus I have more or less systematized the process.

As a Free Software enthusiast, I usually try to stick to free software where possible for each of the above categories, and luckily nowadays there is a lot of non-proprietary, or at least free as in beer, software available out there.

For Windows sysadmins of college and other public institution networks with multiple Windows computers that are not inside a domain, and also for people in computer repair shops where dozens of Windows pre-installs or automatic updates of a set of software are necessary every day, make sure to take a look at Ninite.


As official website introduces Ninite:

Ninite – Install and Update All Your Programs at Once

Of course, as Ninite is used by organizations such as NASA, Harvard Medical School etc., it is possible that the tool reports your list of installed Windows software and various other statistical data about your PC to the Ninite developers, and most likely to the NSA, but this probably doesn't matter much, as that is probably already the case from the moment you chose to install a Windows OS on your PC.

 

For Windows system administrators managing small and medium-sized networks of PCs that are not joined to a domain, Ninite could definitely save hours, and in some cases even days, of boring install and maintenance work. HP Enterprise or HP Inc. employees or ex-employees would definitely love Ninite, because what Ninite does is pretty much like the well known HP internal tool PC COE.

Ninite can also prepare an installer containing multiple applications based on the choices made on Ninite's website, which is a great thing, especially if you need to deploy different types of user PCs (scientific / gamers / office work etc.).

Perhaps there are also other useful things to install on a fresh Windows installation; if you're using something I'm missing, let me know in the comments.

Linux: tricks for extending the lifetime of a damaged hard drive on a live server. Force fsck on next reboot. Read-only file system error solutions

Friday, February 17th, 2023


In our daily work as system administrators we have some very old legacy systems running clustered High Availability proxies using CRM (Cluster Resource Manager), and some legacy systems still use Heartbeat to manage the cluster instead of the newer and more modern Corosync variant.

The HA cluster consists of only 2 Linux nodes running the obscure and long unsupported Red Hat Enterprise Linux 5.11 (Tikanga), whose release series dates back to 2007 (yeah, the years were good) and whose EOL (End of Life) was reached a long time ago, so the OS is no longer supported. However, for about 14 years the machines have been running perfectly fine, until trouble hit one of the cluster nodes managed by ocf::heartbeat:IPAddr2, that is the /etc/ha.d/resource.d/IPAddr2 shell script. Yeah, for the newbies: a Heartbeat application cluster on Linux works like that; it uses a number of extendable shell scripts written for HA management of different kinds of network / web / mail / SQL or whatever services.

The first configured node, however, started failing with errors like:
 

EXT3-fs error (device dm-1): ext3_journal_start_sb: Detected aborted journal
sd 0:2:0:0: rejecting I/O to offline device
Aborting journal on device sda1.
sd 0:2:0:0: rejecting I/O to offline device
printk: 159 messages suppressed.
Buffer I/O error on device sda1, logical block 526
lost page write due to I/O error on sda1
sd 0:2:0:0: rejecting I/O to offline device
sd 0:2:0:0: rejecting I/O to offline device
ext3_abort called.
EXT3-fs error (device sda1): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
sd 0:2:0:0: rejecting I/O to offline device
sd 0:2:0:0: rejecting I/O to offline device
sd 0:2:0:0: rejecting I/O to offline device
sd 0:2:0:0: rejecting I/O to offline device
sd 0:2:0:0: rejecting I/O to offline device
megaraid_sas: FW was restarted successfully, initiating next stage…
megaraid_sas: HBA recovery state machine, state 2 starting…
megasas: Waiting for FW to come to ready state
megasas: FW in FAULT state!!
FW state [-268435456] hasn't changed in 180 secs
megaraid_sas: out: controller is not in ready state
megasas: waiting_for_outstanding: after issue OCR. 
megasas: waiting_for_outstanding: before issue OCR. FW state = f0000000
megaraid_sas: pending commands remain even after reset handling. megasas[0]: Dumping Frame Phys Address of all pending cmds in FW
megasas[0]: Total OS Pending cmds : 0 megasas[0]: 64 bit SGLs were sent to FW
megasas[0]: Pending OS cmds in FW :

The result was that the filesystem of the machine frequently got re-mounted read-only, and of course that is quite bad if you have running haproxy processes that should be living there and taking up some web traffic for high availability, and you run all the traffic only on the 2nd machine of the pair.

This of course was a clear sign of failing disks or of some bad block regions being hit or, as the messages indicate, of some problem with the system hardware or the SAS RAID array.

The physical RAID on the system, just like the rest of the hardware, is very old stuff as well.

[root@haproxy_lb_node1 ~]# lspci |grep -i RAI
01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)

The produced errors not only made the machine re-mount its root / filesystem in read-only mode, but most likely also made the machine reboot automatically every few days, or a few times a day in a row.

The second load balancer, node2, operated perfectly, and we thought that we might just keep the broken machine in that half-running and inconsistent state for a few weeks, until we had built the new machines with a pre-installed new haproxy cluster on a modern Red Hat Enterprise Linux 8.6 distribution. But since we have to follow SLAs (Service Level Agreements) with customers, and the end services behind the High Availability (HA) haproxy cluster were in danger …

We as sysadmins had the task to do our best to stabilize the unstable node with disk errors, so that the system survives and is able to serve traffic normally (if node2, which is in a separate data center, fails due to hardware or electricity issues etc.).

Here are the few steps we took, which have hopefully improved the situation.

1. Make backups of the most important files

Before doing anything with a broken system, always prepare a backup of the most important files. If it is a cluster, that should be a backup of the cluster configuration (if you don't already have one), of /etc/hosts and of any important service configs such as /etc/haproxy/haproxy.cfg and /etc/postfix/postfix.cfg (as it was in my case); preferably back up the whole of /etc/, any important files from /root/ or the /home/users* directories, and at least the latest logs from /var/log.
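A minimal sketch of such a backup, assuming GNU tar and a destination that does not live on the failing disk. The function name backup_configs and the /mnt/backup destination in the comment are hypothetical, they are not part of the original procedure:

```shell
# backup_configs DEST_DIR PATH...
# Packs the given paths into a timestamped tarball under DEST_DIR and
# prints the archive name. The body runs in a subshell, so the helper
# variables do not leak into the calling shell.
backup_configs() (
    dest_dir=$1
    shift
    mkdir -p "$dest_dir" || exit 1
    archive="$dest_dir/config-backup-$(date +%Y%m%d-%H%M%S).tar.gz"
    # tar prints a notice about stripping the leading "/"; it is silenced here.
    tar -czf "$archive" "$@" 2>/dev/null || exit 1
    echo "$archive"
)

# Example call (destination is an assumption, e.g. an NFS mount or USB stick):
# backup_configs /mnt/backup /etc /root /var/log/messages
```

Writing the archive to another machine or an external medium matters here, since every extra write to the damaged disk is a risk of its own.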
 

2. Clear up all unnecessary services and scripts from the server

Any additional software / services, integrity checking tools (daemons), scripts and cron jobs were immediately stopped and, where unused, removed.

E.g. we went through /etc/cron* to check what's there:

# ls -ld /etc/cron.*
drwx------ 2 root root 4096 Feb  7 18:13 /etc/cron.d
drwxr-xr-x 2 root root 4096 Feb  7 17:59 /etc/cron.daily
-rw-r--r-- 1 root root    0 Jul 20  2010 /etc/cron.deny
drwxr-xr-x 2 root root 4096 Jan  9  2013 /etc/cron.hourly
drwxr-xr-x 2 root root 4096 Jan  9  2013 /etc/cron.monthly
drwxr-xr-x 2 root root 4096 Aug 26  2015 /etc/cron.weekly

 

And, like professional butchers, we removed everything unnecessary that could trigger any extra disk reads / writes to the HDD.

E.g. just create a place to keep the old scripts:

# mkdir -p /root/etc_old/etc/{cron.d,cron.daily,cron.hourly,cron.monthly,cron.weekly}

 

And moved away all unnecessary cron job scripts, like:

1. nmon (old school console tool for monitoring network / memory / hard disk and tuning server parameters)
2. clamscan / freshclam crons
3. mlocate (the script that takes care of periodically running the updatedb command, which keeps the locate database up to date so that you can easily search for files inside the DB and put fewer read operations on the disk when you need to find a file, i.e. it saves you from running each time a command like: find / -iname '*whatever_you_look_for*')
4. cups cron jobs
5. logwatch cron
6. rkhunter stuff
7. logrotate (yes, we stopped even the log rotation job, as we found the server was sometimes crashing at the same time when the logrotate job rotating the logs inside /var/log/* was running, perhaps leading to a hit of the I/O read error (bad blocks)).
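The move itself can be sketched as a small helper. This is only an illustration: the function name park_cron_jobs is hypothetical, and the script names in the example comment are assumptions, as the real file names under /etc/cron.* differ between distributions:

```shell
# park_cron_jobs SRC_ROOT DEST_ROOT NAME...
# Moves cron scripts called NAME out of SRC_ROOT/cron.* into the same
# relative directory under DEST_ROOT, so they can be restored later.
park_cron_jobs() (
    src_root=$1
    dest_root=$2
    shift 2
    for dir in "$src_root"/cron.d "$src_root"/cron.daily "$src_root"/cron.hourly \
               "$src_root"/cron.weekly "$src_root"/cron.monthly; do
        [ -d "$dir" ] || continue
        for name in "$@"; do
            [ -e "$dir/$name" ] || continue
            target="$dest_root/$(basename "$dir")"
            mkdir -p "$target"
            mv "$dir/$name" "$target/" && echo "parked $dir/$name"
        done
    done
)

# Example call (script names are assumptions):
# park_cron_jobs /etc /root/etc_old/etc logrotate mlocate.cron rkhunter
```

Moving instead of deleting keeps the option to put every job back once the machine is replaced or repaired.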


We also inspected the cron jobs of the administrator user root for any unwanted scripts and stopped two reporting bash scripts that were part of the PCI tightened security procedures. Therein we found a script responsible for periodically reporting the list of installed packages and whether they have changed, as well as a script that periodically reports via email the list of users created in /etc/passwd and /etc/shadow, historically used to keep an eye on the list of users and easily see if someone has created new users on the machine. Those were enabled via /var/spool/cron/root cron jobs. If this happens to you on other machines, it is a good idea to check all the existing user cron jobs and stop anything that might be putting extra read / write pressure on the hard drives attached to the machine.

# ls -al /var/spool/cron/
total 20
drwx------  2 root root 4096 Nov 13  2015 .
drwxr-xr-x 12 root root 4096 May 11  2011 ..
-rw-------  1 root root  133 Nov 13  2015 root
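To review every user's active cron entries in one go, something like the following sketch could be used. The helper name list_cron_entries is hypothetical, and the default spool path is an assumption that fits RHEL-style systems (Debian-style systems keep the files in /var/spool/cron/crontabs):

```shell
# list_cron_entries [SPOOL_DIR]
# Prints "user: entry" for every active (non-comment, non-empty) line
# found in the per-user crontab files of SPOOL_DIR.
list_cron_entries() (
    spool=${1:-/var/spool/cron}
    for f in "$spool"/*; do
        [ -f "$f" ] || continue
        user=$(basename "$f")
        awk -v u="$user" '!/^[ \t]*(#|$)/ { print u ": " $0 }' "$f"
    done
)

# Example call:
# list_cron_entries /var/spool/cron
```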


3. Clear up old log files and any unnecessary files

Under /var/log, /home, /var/tmp and /var/spool/tmp, immediately try to clear up the old log files.
From my past experience, this has many times freed up inodes and blocks that sit on an undamaged part (good blocks) of the hard drive, ready to be reused by the files newly written by the rsyslog / syslogd services.

!!! Note that during the removal of some files you might hit files stored on bad blocks, which might lead to an unexpected system reboot.

But that's okay, don't worry: most likely after a hard reset by a technician in the datacenter the machine will boot again, and you can enjoy removing the remaining files to send them to the heaven for old files.
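A hedged sketch of such a cleanup, limited to rotated logs so that live log files stay untouched. The function name purge_old_logs and the 30-day threshold in the example are assumptions, pick whatever retention fits your case:

```shell
# purge_old_logs DIR DAYS
# Deletes rotated logs (*.gz and files ending in a single digit such as
# messages.1) older than DAYS days under DIR, printing each removed file.
purge_old_logs() (
    dir=$1
    days=$2
    find "$dir" -type f \( -name '*.gz' -o -name '*.[0-9]' \) \
        -mtime +"$days" -print -exec rm -f {} \;
)

# Example call:
# purge_old_logs /var/log 30
```

Running it first with -print only (without the -exec part) is a cheap way to preview what would be removed.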

 

4. Trigger an automatic file system check with fsck on next boot

The standard way to force Linux to automatically recheck its root filesystem is to simply create the file /forcefsck on the root partition, or on any other secondary disk partition you would like to check.

# touch /forcefsck

# reboot


However, on some occasions you might be unable to do it because the / (root fs) has been remounted in read-only mode, yikes …

Luckily, old Linux distributions like this RHEL 5.11 have a way to force a filesystem check with fsck after reboot, to identify any unknown bad blocks and hopefully succeed in isolating them, so you don't hit the same auto-reboots (provided the hard drive or software / hardware RAID is not in a terrible state): you can use an option built into the /sbin/shutdown command, '-F':

   -F     Force fsck on reboot.


Hence to make the machine reboot and trigger immediately fsck:

# shutdown -rF now


Just in case you wonder why you need to reboot before checking the filesystems: simply because they need to be unmounted before you check them.

In that specific case this has produced a good result so far: the machine booted just fine, and we crossed our fingers and prayed that the machine would work flawlessly in the coming few weeks, until we finalize the configuration of the substitute machines, where this old infrastructure will be migrated to a newly built cluster with a new haproxy and a Corosync / Pacemaker cluster on a brand new RHEL.

NB! On newer machines this won't work, however, as the shutdown command shipped with systemd no longer has this option; it belonged to the older SystemV init era.
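The two ways of forcing the check described above can be combined into a small decision helper. This is a sketch under assumptions: the function name request_fsck is hypothetical, and it deliberately does not reboot anything by itself, it only reports which method applies:

```shell
# request_fsck [ROOT]
# Tries to create ROOT/forcefsck (ROOT defaults to /) and prints which
# method applies. It does NOT reboot the machine.
request_fsck() (
    root=${1:-/}
    if touch "${root%/}/forcefsck" 2>/dev/null; then
        echo "flag-file"    # flag created; continue with: reboot
    else
        echo "shutdown-F"   # not writable; on SysV init use: shutdown -rF now
    fi
)

# Example call:
# request_fsck /
```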
 

5. Hints on checking the hard drives with fsck

If you happen to have physical access to the remote hardware machine via a TTY[1-9] console, that's even better and is the standard way to do it, but in this specific case we had no easy way to get access to the physical server console.

It is even better to go there and work either via a connected monitor (display) or a KVM switch (for those who hear about a KVM switch for the first time: this is a great device in server rooms that lets you control multiple servers from the same keyboard, monitor and mouse). It is best to boot one of the multitude of Linux recovery distributions from USB, or from a CDROM / DVD on older machines like this one, with Red Hat's recovery (rescue) mode.
Once booted into the recovery environment, with the partitions to be checked left unmounted, simply check each of the disks, e.g.:

# fsck -y /dev/sdb
# fsck -y /dev/sdc

Or, if you don't want to waste time checking each hard drive one by one, but rather directly check all the filesystems known to the Linux distro via their /etc/fstab definitions (-A), skipping the root filesystem (-R), run:

# fsck -AR

If necessary, and you have a mixture of filesystems, for example EXT3, EXT4 and REISERFS, you can tell it to omit some filesystem type, for example ext3, like that:

# fsck -AR -t noext3 -y


To make fsck skip mounted filesystems:

# fsck -M /dev/sdb


One remark to make here: to complete its job on the various filesystem types, fsck uses external helper binaries, usually stored in /sbin/fsck*:

# ls -al /sbin/fsck*
-rwxr-xr-x 1 root root  55576 20 Jan 2022 /sbin/fsck*
-rwxr-xr-x 1 root root  43272 20 Jan 2022 /sbin/fsck.cramfs*
lrwxrwxrwx 1 root root      9  4 Jul 2020 /sbin/fsck.exfat -> exfatfsck*
lrwxrwxrwx 1 root root      6  7 Jun 2021 /sbin/fsck.ext2 -> e2fsck*
lrwxrwxrwx 1 root root      6  7 Jun 2021 /sbin/fsck.ext3 -> e2fsck*
lrwxrwxrwx 1 root root      6  7 Jun 2021 /sbin/fsck.ext4 -> e2fsck*
-rwxr-xr-x 1 root root  84208  8 Feb 2021 /sbin/fsck.fat*
-rwxr-xr-x 2 root root 393040 30 Nov 2009 /sbin/fsck.jfs*
-rwxr-xr-x 1 root root 125184 20 Jan 2022 /sbin/fsck.minix*
lrwxrwxrwx 1 root root      8  8 Feb 2021 /sbin/fsck.msdos -> fsck.fat*
-rwxr-xr-x 1 root root    333 16 Dec 2021 /sbin/fsck.nfs*
lrwxrwxrwx 1 root root      8  8 Feb 2021 /sbin/fsck.vfat -> fsck.fat*
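Before relying on fsck for a given filesystem type, it may be worth confirming that the matching helper is actually installed. A minimal sketch, assuming the helpers follow the fsck.TYPE naming seen in the listing above; the function name fsck_helper is hypothetical:

```shell
# fsck_helper TYPE [DIR]
# Prints the path of the fsck.TYPE helper if it exists and is executable,
# otherwise reports the problem on stderr and fails.
fsck_helper() (
    helper="${2:-/sbin}/fsck.$1"
    if [ -x "$helper" ]; then
        echo "$helper"
    else
        echo "no fsck helper for $1" >&2
        exit 1
    fi
)

# Example call:
# fsck_helper ext3
```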


6. Using tune2fs to adjust tunable filesystem parameters on ext2/ext3/ext4 filesystems (a few examples)

a) To check whether the filesystem was really checked at boot time, or to check any filesystem on the server for the date of its last fsck check:

#  tune2fs -l /dev/sda1 | grep checked
Last checked:             Wed Apr 17 11:04:44 2019

On some distributions, like old Debian and Ubuntu, it is even possible to make fsck log its operations during the check on reboot by changing the verbosity from no to yes:

# sed -i "s/#VERBOSE=no/VERBOSE=yes/" /etc/default/rcS


If you're having the issues on old Debian Linuxes and not on RHEL, it is possible to:

b) Enable automatic fsck repairs on boot

by running:
 

# sed -i "s/FSCKFIX=no/FSCKFIX=yes/" /etc/default/rcS


c) Forcing an fsck check of the server-attached hard drive partitions with tune2fs

# tune2fs -c 1 /dev/sdXY

Note that:
tune2fs can force a fsck on each reboot for EXT4, EXT3 and EXT2 filesystems only.

tune2fs can trigger a forced fsck on every reboot using the -c (max-mount-counts) option.
This option sets the number of mounts after which the filesystem will be checked, so setting it to 1 will run fsck each time the computer boots.
Setting it to -1 or 0 resets this (the number of times the filesystem is mounted will be disregarded by e2fsck and the kernel).


 For example you could:

d) Set fsck to run a filesystem check every 30 boots, by using -c 30 
 

# tune2fs -c 30 /dev/sdXY



e) Check when was the last time the file system /dev/sdX was checked:
 

# tune2fs -l /dev/sdX | grep Last\ c
Last checked:             Thu Jan 12 20:28:34 2017


f) Check how many times our /dev/sdX filesystem was mounted

# tune2fs -l /dev/sdX | grep Mount
Mount count:              157

g) Check how many mounts are allowed to pass before filesystem check is forced
 

# tune2fs -l /dev/sdX | grep Max
Maximum mount count:      -1
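The mount counters shown above can be combined to see how close a filesystem is to its next count-based check. This is a sketch that parses the "Mount count" and "Maximum mount count" lines of tune2fs -l output; the function name mounts_until_check is hypothetical:

```shell
# mounts_until_check   (reads "tune2fs -l" output on stdin)
# Prints how many mounts remain before a count-based fsck is due,
# or "disabled" when the maximum mount count is 0 or -1.
mounts_until_check() {
    awk -F': *' '
        /^Mount count:/         { count = $2 }
        /^Maximum mount count:/ { max = $2 }
        END {
            if (max <= 0)          print "disabled"
            else if (count >= max) print 0
            else                   print max - count
        }'
}

# Example call:
# tune2fs -l /dev/sdX | mounts_until_check
```

With the values from the listings above (mount count 157, maximum mount count -1) the helper reports that count-based checking is disabled.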


7. Repairing disk / partitions via the GRUB fsck.mode and fsck.repair kernel command line options

It is also possible to force an fsck repair on boot via GRUB, but that is usually not an option someone would like, as the machine might fail to boot if the filesystem is hard to repair; however, in difficult situations with failing disks, enabling it temporarily is a good idea.

This can be done by including the options in the GRUB defaults config (/etc/default/grub):

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash fsck.mode=force fsck.repair=yes"

fsck.mode=force – will force an fsck each time the system boots; keeping that value enabled for a long time inside GRUB is a bad idea for servers, as booting can sometimes be severely prolonged by the checks, especially on servers with many or slow old hard drives.

fsck.repair=yes – will make fsck try to repair any errors it finds while checking (be absolutely sure you know what you're doing if passing this option)

The options can also be set by editing the kernel line on the GRUB boot screen, if you have physical access to the server and don't want to regenerate the GRUB config and possibly make the machine unbootable on next boot.
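Whichever way the options were passed, after the boot it is possible to confirm that the kernel really received them by looking at /proc/cmdline. A minimal sketch; the function name fsck_boot_options is hypothetical:

```shell
# fsck_boot_options [CMDLINE_FILE]
# Prints the fsck.* parameters found on the kernel command line,
# one per line, or "none" when no such parameter was passed.
fsck_boot_options() (
    file=${1:-/proc/cmdline}
    tr ' ' '\n' < "$file" | grep '^fsck\.' || echo "none"
)

# Example call:
# fsck_boot_options
```

Keep in mind that a change in /etc/default/grub only takes effect after the GRUB config has been regenerated, typically with update-grub on Debian / Ubuntu or grub2-mkconfig on RHEL-family systems (the output path of the latter depends on the distro and on BIOS vs UEFI boot).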
 

8. A few more details on how the /etc/fstab fsck check parameter values work on systemd Linux machines

The "proper" way on systemd (if we can talk about a proper way on Linux) to run fsck for each filesystem that has an fsck helper is to set a number greater than 0 in the last column of /etc/fstab, so make sure you edit your /etc/fstab if that's not the case.

The root partition should be set to 1 (first to be checked), while other partitions you want to be checked should be set to 2.

Example /etc/fstab:
 

# /etc/fstab: static file system information.

/dev/sda1  /      ext4  errors=remount-ro  0  1
/dev/sda5  /home  ext4  defaults           0  2

The meaning of the values you can put in this last (second) number is as follows:
0 – disabled, that is do not check filesystem
1 – partition with this PASS value has a higher priority and is checked first. This value is usually set to the root / partition
2 – partitions with this PASS value will be checked last
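To spot filesystems that will never be checked at boot, the last column can be scanned with a one-liner. A sketch limited to ext2/ext3/ext4 entries; the function name fstab_unchecked is hypothetical:

```shell
# fstab_unchecked [FSTAB]
# Lists the mount points of ext2/ext3/ext4 entries whose 6th field
# (the fsck pass number) is 0 or missing.
fstab_unchecked() {
    awk '$1 !~ /^#/ && $3 ~ /^ext[234]$/ && ($6 == "" || $6 == 0) { print $2 }' \
        "${1:-/etc/fstab}"
}

# Example call:
# fstab_unchecked /etc/fstab
```

For the example /etc/fstab above it prints nothing, since both entries already carry a pass value of 1 or 2.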

a) Check the produced log out of fsck

Unfortunately, on older Linux distros with SystemV init the fsck log output might not be generated anywhere except on the physical console. So if you have some kind of device duplicating the physical tty on the display port of the server, you might capture some bad block reports or fixed error messages, but if you don't, you can just cross your fingers and hope that any FS irregularities found were recovered.

On systemd Linux machines the fsck log should be produced either in /run/initramfs/fsck.log or in some other location, depending on the Linux distro, and you should be able to see something from fsck inside the /var/log/* logs:

# grep -rli fsck /var/log/*


Close it up

Having a system with a failing disk is really one of the worst sysadmin nightmares. The good news is that in most cases we're prepared with some working backup or some workaround, like the few steps explained above, to reduce the amount of reads / writes to the HDDs of the failing machine.

If the failing disk holds the primary Linux filesystem, it all becomes even worse, as on every next reboot you have no guarantee that the kernel / initrd or some of the other components required to run the core Linux system won't break the normal boot. Thus, on one side, making changes on the hard drives is risky business; on the other side, if you have a mirror system, i.e. the failing system is not a standalone Linux server but one of a cluster pair, then this is not such a big deal, as at least one of the nodes is still up, running and serving.

Still, doing too many operations on the HDD is always a danger, so even though the steps described lead in most cases to an improvement in how the system behaves, the system should be considered totally unreliable and be closely monitored, not only by some monitoring stuff like Zabbix / Prometheus or whatever, but also by regularly checking the system's state via normal SSH logins.

If you have some important data or logs on the system that are not synchronized to another node, it is important to copy them before doing any of the described operations. After the minimum is backed up, proceed to clear up everything that can be cleared up while the machine still continues to provide most of its functionality, trigger an automatic fsck HDD check on next reboot, reboot, check what is going on and monitor the machine from there on.

Hopefully the few described steps have helped some sysadmin. There are plenty of things I've described that might go wrong, and even following the described steps might not help if the machine's storage drives / SAS / SSD have too much damage. But, as said, in most cases following these few steps would improve the machine's state.

Wish you the best of luck!

 

Log rsyslog script incoming tagged string messages to a separate external file to prevent /var/log/messages from string flood

Wednesday, December 22nd, 2021


If you're using some external bash script to log messages via rsyslogd to one of the multiple data tubes rsyslog understands (called facility levels in rsyslog language), and you want rsyslog to move the message strings to an external log file, then you have the same task as I had a few days ago.

For example, you have a bash shell script that writes messages to the rsyslog daemon on one of the predefined facility levels, be it:

kern, user, cron, auth etc. or some localX

and your logged script data ends up in the wrong file location: /var/log/messages, /var/log/secure, /var/log/cron etc. However, you need to log everything coming from that service to a separate file based on the localX facility level. The usual way to do it is via some rsyslog config like:
 

local1.info                                            /var/log/custom-log.log

# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none;local0.none;local1.none        /var/log/messages


Note that local1.none instructs rsyslog not to log anything from the local1 facility to /var/log/messages.
But what if this does not work, due to some weirdness in the configuration of rsyslog on the server, or even due to some weird misconfiguration in /etc/systemd/journald.conf such as:

[Journal]
Storage=persistent
RateLimitInterval=0s
RateLimitBurst=0
SystemMaxUse=128M
SystemMaxFileSize=32M
MaxRetentionSec=1month
MaxFileSec=1week
ForwardToSyslog=yes
SplitFiles=none

Due to that config, and especially ForwardToSyslog=yes, the messages sent via the logger tool to local1 still end up inside /var/log/messages, not nice huh ..

The result is that anything sent with a predefined TAGGED string by the whatever.sh script, which uses the logger command (if you have never used it, check man logger) to enter messages into rsyslog with commands like:
 

# logger -p local1.info -t TAG_STRING

# logger -p local2.warn test
# tail -2 /var/log/messages
Dec 22 18:58:23 pcfreak rsyslogd: -- MARK --
Dec 22 19:07:12 pcfreak hipo: test


was nevertheless logged to /var/log/messages.
Of course, /var/log/messages becomes overfilled with "junk" shell script data not related to real basic operating system administration, so any critical or important messages that usually come under /var/log/messages or /var/log/syslog get lost among the big quantities of other tagged data reaching the log.

I made many attempts to resolve the issue by modifying /etc/rsyslog.conf as well as the messed up /etc/systemd/journald.conf (which, by the way, was generated with these strange values by some OS install time Ansible automation). It took me a while until I found the solution on how to tell rsyslog to log the tagged message strings into a separate external file. From my 20 minutes of research online I have seen multitudes of people on different Linux OS versions experiencing the same or similar issues for whatever reason, so this triggered me to write this small article on the rsyslog solution.

The solution turned out to be pretty easy but requires some further digging into rsyslog; Red Hat's documentation on basic rsyslog configuration is a very nice read for starters. In my case I've used one of the property-based compare-operations, contains, to select my tagged message string.
 

1. Add msg contains compare-operations to output log file and discard the messages

[root@centos bin]# vi /etc/rsyslog.conf

# config to log everything logged to rsyslog to a separate file
:msg, contains, "tag_string:/"         /var/log/custom-script-log.log
:msg, contains, "tag_string:/"    ~

Substitute the quoted tag_string:/ with whatever your tag is, and mind that it is better to place this config somewhere near the beginning of /etc/rsyslog.conf. Also touch the file /var/log/custom-script-log.log and give it some decent permissions such as 755 (see step 2).
 

1.1 Discarding a message


The tilde sign (~) placed at the end of the second :msg, contains line is the one that actually tells rsyslog to discard the message, so that it does not end up in /var/log/messages.

An alternative rsyslog config to discard the unwanted message once you have it logged is to use the rawmsg variable with stop, like so:

 

# config to log everything logged to rsyslog to a separate file
:msg, contains, "tag_string:/"         /var/log/custom-script-log.log
:rawmsg, isequal, "tag_string:/" stop

Another way to stop processing immediately after the log is written to the custom file, across some older versions of rsyslog, is via & stop:

:msg, contains, "tag_string:/"         /var/log/custom-script-log.log
& stop

I don't know about other versions, but unfortunately & stop does not work on RHEL 7.9 with the installed rpm package rsyslog-8.24.0-57.el7_9.1.x86_64.

1.2 More on property-based filters: basic exclusion of a string

Property-based filters can do much more; you can, for example, do regular expression based matches of strings coming into rsyslog and forward them somewhere.

To select syslog messages which do not contain any mention of the words fatal and error with any or no text between them (for example, fatal lib error), type:

:msg, !regex, "fatal .* error"
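To get a feeling for what such a pattern selects before putting it into rsyslog.conf, it can be tried with grep, whose default syntax is also POSIX BRE. This only illustrates the pattern semantics, it is not how rsyslog evaluates filters internally; the function name without_fatal_error is hypothetical:

```shell
# without_fatal_error   (reads log lines on stdin)
# Keeps only the lines that do NOT match the BRE "fatal .* error",
# i.e. the same pattern as in the negated rsyslog filter above.
without_fatal_error() {
    grep -v 'fatal .* error'
}

# Example call:
# printf 'fatal lib error\ndisk ok\n' | without_fatal_error
```

Note that, as written, the pattern needs a space after "fatal" and another one before "error", so a line like "fatal error" with a single space between the words does not match it.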

 

2. Create file where tagged data should be logged and set proper permissions
 

[root@centos bin]# touch /var/log/custom-script-log.log
[root@centos bin]# chmod 755 /var/log/custom-script-log.log


3. Test rsyslogd configuration for errors and reload rsyslog

[root@centos ]# rsyslogd -N1
rsyslogd: version 8.24.0-57.el7_9.1, config validation run (level 1), master config /etc/rsyslog.conf
rsyslogd: End of config validation run. Bye.

[root@centos ]# systemctl restart rsyslog
[root@centos ]#  systemctl status rsyslog 
● rsyslog.service – System Logging Service
   Loaded: loaded (/usr/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2021-12-22 13:40:11 CET; 3h 5min ago
     Docs: man:rsyslogd(8)
           http://www.rsyslog.com/doc/
 Main PID: 108600 (rsyslogd)
   CGroup: /system.slice/rsyslog.service
           └─108600 /usr/sbin/rsyslogd -n
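After the restart, a quick way to verify the result is to send a probe message and count where the tag shows up. The sketch below only does the counting part; the function name count_tagged and the probe command in the comment are assumptions, adapt them to how your script really calls logger:

```shell
# count_tagged TAG FILE
# Prints how many lines of FILE contain the fixed string TAG.
count_tagged() {
    grep -cF -- "$1" "$2"
}

# Example session (hypothetical probe message):
# logger -p local1.info "tag_string:/ probe"
# count_tagged 'tag_string:/' /var/log/custom-script-log.log
# count_tagged 'tag_string:/' /var/log/messages
```

If the filter works, the first counter grows with every probe while the second one stays where it was.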

 

4. Property-based compare-operations supported by rsyslog table
 

Compare-operation | Description
contains | Checks whether the provided string matches any part of the text provided by the property. To perform case-insensitive comparisons, use contains_i.
isequal | Compares the provided string against all of the text provided by the property. These two values must be exactly equal to match.
startswith | Checks whether the provided string is found exactly at the beginning of the text provided by the property. To perform case-insensitive comparisons, use startswith_i.
regex | Compares the provided POSIX BRE (Basic Regular Expression) against the text provided by the property.
ereregex | Compares the provided POSIX ERE (Extended Regular Expression) against the text provided by the property.
isempty | Checks if the property is empty. The value is discarded. This is especially useful when working with normalized data, where some fields may be populated based on normalization result.

 


5. Understanding rsyslog facility levels

Here is a list of facility levels that can be used.

Note: The mapping between facility number and keyword is not uniform across different operating systems and syslog implementations, so among separate Linuxes there might be differences in the naming and numbering.

Facility Number | Keyword | Facility Description
0 | kern | kernel messages
1 | user | user-level messages
2 | mail | mail system
3 | daemon | system daemons
4 | auth | security/authorization messages
5 | syslog | messages generated internally by syslogd
6 | lpr | line printer subsystem
7 | news | network news subsystem
8 | uucp | UUCP subsystem
9 | | clock daemon
10 | authpriv | security/authorization messages
11 | ftp | FTP daemon
12 | | NTP subsystem
13 | | log audit
14 | | log alert
15 | cron | clock daemon
16 | local0 | local use 0 (local0)
17 | local1 | local use 1 (local1)
18 | local2 | local use 2 (local2)
19 | local3 | local use 3 (local3)
20 | local4 | local use 4 (local4)
21 | local5 | local use 5 (local5)
22 | local6 | local use 6 (local6)
23 | local7 | local use 7 (local7)


6. rsyslog Severity levels (sublevels) accepted by facility level

As defined in RFC 5424, there are eight severity levels as of year 2021:

Code Severity Keyword Description General Description
0 Emergency emerg (panic) System is unusable. A "panic" condition usually affecting multiple apps/servers/sites. At this level it would usually notify all tech staff on call.
1 Alert alert Action must be taken immediately. Should be corrected immediately, therefore notify staff who can fix the problem. An example would be the loss of a primary ISP connection.
2 Critical crit Critical conditions. Should be corrected immediately, but indicates failure in a primary system, an example is a loss of a backup ISP connection.
3 Error err (error) Error conditions. Non-urgent failures, these should be relayed to developers or admins; each item must be resolved within a given time.
4 Warning warning (warn) Warning conditions. Warning messages, not an error, but indication that an error will occur if action is not taken, e.g. file system 85% full – each item must be resolved within a given time.
5 Notice notice Normal but significant condition. Events that are unusual but not error conditions – might be summarized in an email to developers or admins to spot potential problems – no immediate action required.
6 Informational info Informational messages. Normal operational messages – may be harvested for reporting, measuring throughput, etc. – no action required.
7 Debug debug Debug-level messages. Info useful to developers for debugging the application, not useful during operations.
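
On the wire, facility and severity travel together as one priority value (PRI), which RFC 5424 defines as the facility number times 8 plus the severity code. Below is a minimal shell sketch of that arithmetic using the numbers from the two tables above. The function names syslog_pri and syslog_decode are made up for illustration and are not part of rsyslog.

```shell
# PRI = facility * 8 + severity (RFC 5424)
syslog_pri() {
    # $1 = facility number (0-23), $2 = severity code (0-7)
    echo $(( $1 * 8 + $2 ))
}

# Reverse direction: split a PRI value back into "facility severity"
syslog_decode() {
    echo "$(( $1 / 8 )) $(( $1 % 8 ))"
}

syslog_pri 4 2      # auth.crit     -> 34
syslog_decode 165   # local4.notice -> 20 5
```

This is the number that appears in angle brackets at the start of a raw syslog packet, e.g. <34>.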


7. Sample well tuned configuration using severity and facility levels and immark, imuxsock, impstats
 

Below is sample config using severity and facility levels
 

# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none;local0.none;local1.none        /var/log/messages


Note the local0.none; local1.none tells rsyslog to not log from that facility level to /var/log/messages.

If you need a complete rsyslog configuration, fine-tuned for proper logging with increased queues, forwarding to a remote log aggregator service, and safeguards that keep the system disk from filling up when a misbehaving service floods the log with repeated messages, you can always contact me and I can help 🙂
Other than that, sysadmins might benefit from a sample set of configuration prepared with the Automated rsyslog config builder, or use a fine-tuned config for rsyslog-8.24.0-57.el7_9.1.x86_64 on Redhat 7.9 (Maipo): rsyslog_config_redhat-2021.tar.gz.

To sum it up rsyslog though looks simple and not an important thing to pre

Defining multiple short Server Hostname aliases via SSH config files and defining multiple ssh options for it, Use passwordless authentication via public keys

Thursday, September 16th, 2021

using-ssh-host-acronym-aliases-ssh-client-explained-openssh-logo

In case you have to access multiple servers from your terminal client such as gnome-terminal, kterminal (if on Linux) or something such as mobaxterm + cygwin (if on Windows) with an OpenSSH client (the ssh command), there is a nifty trick to save time and typing: create shortcut aliases by adding a few definitions to $HOME/.ssh/config ( ~/.ssh/config ) for your local non-root user, or even make the configuration system wide (for all existing local /etc/passwd users) via /etc/ssh/ssh_config.
Adding a pseudonym alias for each server makes sysadmin life much easier, as you don't have to type each time the FQDN (Fully Qualified Domain Name) of the remote Linux / Unix / BSD / Mac OS or even Windows sshd-ready host reachable via remote TCP/IP port 22.


1. Adding local user remote server pointer aliases via ~/.ssh/config


The file ~/.ssh/config is read by the ssh client (part of the openssh-client Linux package) on each invocation. Besides defining a pseudonym for each host, which saves you time when accessing remote hosts and hence increases your productivity, it lets you define various other per-host session options, such as the remote SSH port (for example if your OpenSSHD is configured to run on a non-standard port, let's say 2022 instead of the default TCP port 22, for some reason, e.g. security through obscurity).

 

The general syntax of the .ssh/config file is simple, it goes like this:
 

Host MACHINE_HOSTNAME

SSH_OPTION1 value1
SSH_OPTION2 value1 value2
SSH_OPTION3 value1 value2

 

Host MACHINE_HOSTNAME2

SSH_OPTION value
SSH_OPTION1 value1 value2

  • Another understood syntax if you prefer to not have empty whitespaces is to use ( = )
    between the parameter name and values.

Host MACHINE_HOSTNAME
SSH_config=value
SSH_config1=value1 value2

  • All empty lines and lines starting with the hash sign ( # ) are ignored.
  • All values are case-sensitive, but parameter names are not.

If you have never used $HOME/.ssh/config before, you have to create the directory and the file and set the proper permissions on them like so:

mkdir -p $HOME/.ssh
chmod 0700 $HOME/.ssh
touch $HOME/.ssh/config
chmod 0600 $HOME/.ssh/config


Below are examples taken from my .ssh/config configuration for all subdomains for my pcfreak.org domain

 

# Ask for password for every subdomain under pc-freak.net for security
Host *.pcfreak.org
user hipopo
passwordauthentication yes
StrictHostKeyChecking no

# ssh public Key authentication automatic login
Host www1.pc-freak.net
user hipopo
Port 22
passwordauthentication no
StrictHostKeyChecking no

UserKnownHostsFile /dev/null

Host haproxy2
    Hostname 213.91.190.233
    User root
    Port 2218
    PubkeyAuthentication yes
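    # IdentityFile normally names the private key; a .pub file works only if the matching private key is loaded in ssh-agent
    IdentityFile ~/.ssh/haproxy2.pub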
    StrictHostKeyChecking no
    LogLevel INFO     

Host pcfrxenweb
    Hostname 83.228.93.76
    User root
    Port 2218

    PubkeyAuthentication yes
    IdentityFile ~/.ssh/pcfrxenweb.key    
    StrictHostKeyChecking no

Host pcfreak-sf
    Hostname 91.92.15.51
    User root
    Port 2209
    PreferredAuthentications password
    StrictHostKeyChecking no

    Compression yes


As you can see from the above configuration, Hostname can hold either an IP address or a DNS hostname.

Now to connect to defined IP 91.92.15.51 you can simply refer to its alias

$ ssh pcfreak-sf -v

and you land on the machine's sshd on port 2209, where you are prompted for a password.

$ ssh pcfrxenweb -v


would lead to IP 83.228.93.76, SSH on Port 2218, and will use the defined key for a passwordless login, saving you from typing the password each time.

The above ssh command is a short alias you can use from now on instead of typing every time:

$ ssh -i ~/.ssh/pcfrxenweb.key -p 2218 root@83.228.93.76
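
Going the other way, a long command line like this one can be turned into a Host block mechanically. The helper below is a hypothetical sketch (ssh_host_block is not part of OpenSSH). It only prints the block, so it can be reviewed before being appended to ~/.ssh/config.

```shell
# ssh_host_block ALIAS HOSTNAME USER PORT [IDENTITYFILE]
ssh_host_block() {
    printf 'Host %s\n' "$1"
    printf '    Hostname %s\n' "$2"
    printf '    User %s\n' "$3"
    printf '    Port %s\n' "$4"
    # IdentityFile is optional; when given it should name the private key
    if [ -n "${5:-}" ]; then
        printf '    IdentityFile %s\n' "$5"
    fi
}

# Same values as the long command above
ssh_host_block pcfrxenweb 83.228.93.76 root 2218 '~/.ssh/pcfrxenweb.key'
```

Once the output looks right, redirect it with >> ~/.ssh/config to store the alias.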

There is another nifty trick worth mentioning. If you have a host defined with certain options, such as haproxy2 in the above config, but you would like to override one of them for a single connection, for example to connect not as the default User root but as some other account, let's say devuser@haproxy2, you can type:

$ ssh -o "User=devuser" haproxy2

StrictHostKeyChecking no

– this option tells ssh not to refuse the connection when the remote host key is unknown or its fingerprint has changed. The fingerprint usually changes when the server's host key files ( /etc/ssh/ssh_host_*_key ) have been regenerated, for example after an OS reinstall.
Of course you should use this option only if you access your remote host via a secured VPN or local network, otherwise a host key change could be an indicator that someone is trying to intercept your ssh session.

 

Compression yes


– this option enables compression of the connection. It was most useful on old modem telephone lines, but it can still save some bandwidth on slow links.

It is also possible to define a whole range of IP addresses to be accessed with one single key.

The .ssh/config block below does that:
 

Host 192.168.2.*
     User admin
     IdentityFile ~/.ssh/id_ed25519


Would instruct ssh to connect to every host in the IP range 192.168.2.1-254 with the admin user and the ed25519 key by default. IdentityFile has to name the private key, not the .pub file, unless the key is loaded in ssh-agent. Note that there is no Hostname line here: it would redirect every matching address to one single machine.
 

$ ssh 192.168.2.18 -v
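
Host patterns in ssh_config(5) understand two wildcards: * matches any run of characters and ? matches exactly one character. Shell case patterns treat those two characters the same way, so a pattern can be tried out before it goes into the config. This is only an approximation (case also accepts bracket expressions, which ssh does not), and host_matches is a made-up helper name.

```shell
# host_matches PATTERN HOST -> exit status 0 when HOST matches PATTERN
host_matches() {
    case "$2" in
        $1) return 0 ;;   # unquoted on purpose: $1 is used as a glob pattern
        *)  return 1 ;;
    esac
}

host_matches '192.168.2.*' 192.168.2.18 && echo "* covers the whole subnet"
host_matches '192.168.2.?' 192.168.2.18 || echo "? only covers single-character host parts"
```

In other words, a trailing ? is enough for addresses .0 to .9 only, while a trailing * covers the whole /24.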

 

2. Adding ssh client options system wide for all existing local or remote LDAP login users


The way to add a Host block is exactly the same as for a single user, except that you add the configuration to /etc/ssh/ssh_config. Here is the configuration from my latest Debian Linux:

$ cat /etc/ssh/ssh_config

# This is the ssh client system-wide configuration file.  See
# ssh_config(5) for more information.  This file provides defaults for
# users, and the values can be changed in per-user configuration files
# or on the command line.

# Configuration data is parsed as follows:
#  1. command line options
#  2. user-specific file
#  3. system-wide file
# Any configuration value is only changed the first time it is set.
# Thus, host-specific definitions should be at the beginning of the
# configuration file, and defaults at the end.

# Site-wide defaults for some commonly used options.  For a comprehensive
# list of available options, their meanings and defaults, please see the
# ssh_config(5) man page.

Host *
#   ForwardAgent no
#   ForwardX11 no
#   ForwardX11Trusted yes
#   PasswordAuthentication yes
#   HostbasedAuthentication no
#   GSSAPIAuthentication no
#   GSSAPIDelegateCredentials no
#   GSSAPIKeyExchange no
#   GSSAPITrustDNS no
#   BatchMode no
#   CheckHostIP yes
#   AddressFamily any
#   ConnectTimeout 0
#   StrictHostKeyChecking ask
#   IdentityFile ~/.ssh/id_rsa
#   IdentityFile ~/.ssh/id_dsa
#   IdentityFile ~/.ssh/id_ecdsa
#   IdentityFile ~/.ssh/id_ed25519
#   Port 22
#   Protocol 2
#   Ciphers aes128-ctr,aes192-ctr,aes256-ctr,aes128-cbc,3des-cbc
#   MACs hmac-md5,hmac-sha1,umac-64@openssh.com
#   EscapeChar ~
#   Tunnel no
#   TunnelDevice any:any
#   PermitLocalCommand no
#   VisualHostKey no
#   ProxyCommand ssh -q -W %h:%p gateway.example.com
#   RekeyLimit 1G 1h
    SendEnv LANG LC_*
    HashKnownHosts yes
    GSSAPIAuthentication yes

As you can see, quite a lot can be enabled by default, such as forwarding of the authentication agent (the -A option), which some company server environments require. So if you have to connect to remote hosts with Agent Forwarding enabled, instead of typing

ssh -A user@remotehostname


simply uncomment the ForwardAgent line in the above config and set it to yes

ForwardAgent yes


Likewise, to enable X11 forwarding by default instead of typing

ssh -X user@remotehostname


simply uncomment these two options and set them to yes
 

ForwardX11 yes
ForwardX11Trusted yes

As you can see, ssh can do pretty much: you can enable SSH Tunneling or run via a Proxy with ProxyCommand (if it is the first time you hear about ProxyCommand, I warmly recommend you check my previous article – How to pass SSH traffic through a secured Corporate Proxy Server with corkscrew).

Sometimes for a defined hostname, due to changes in the remote server ssh configuration, the SSH encryption type or a host key removal, you might end up with issues connecting. In that case you can override all the previously defined options inside .ssh/config by ignoring the configuration with -F /dev/null (when -F is given, the system-wide /etc/ssh/ssh_config is ignored as well)

$ ssh -F /dev/null user@freak -v


What we learned ?

To sum it up, in this article we have learned how to ease the stressed sysadmin life by adding aliases with specific port numbers and options for different remotely administrated Linux / Unix hosts via the local ~/.ssh/config or the global /etc/ssh/ssh_config, as well as how configuration already applied from ~/.ssh/config, which affects each ssh command the user runs, can be overridden.

Update reverse sshd config with cronjob to revert if sshd reload issues

Friday, February 12th, 2021

Update-reverse-sshd-config-with-cronjob-to-revert-if-sshd-reload-issues

Say you're doing ssh hardening by modifying /etc/ssh/sshd_config for better system security, or just changing sshd options due to some requirement. But you follow the wrong guide and add an option which works on newer versions such as OpenSSH_8.0p1 / or 7, while the server runs an older SSH version. As a result, restarting sshd via /etc/init.d/… or systemctl restart sshd cuts your access to the remote server. If that server is located in a DC, is not attached to an Admin LAN port and has no working ILO or IDRAC configured, you have to wait a couple of hours for some Support to go to the server Room / Rack / line location, get on a physical Linux tty console and fix it by reverting your last sshd changes and restarting.

So the logical question is: what can you do to make sure you do not cut your network access to the remote machine after modifying the OpenSSHD configuration and restarting SSHD?

There is an old trick I have been using for years, which you might not know if you're just starting with Linux as a novice system administrator or a server support guy. It is as simple as setting a cron job that every few minutes overwrites the sshd configuration with a copy of the old working version from before the modification.

Here is this nifty trick, which saved me the headache of calling the technical support line of ValueWeb when I was administering some old Linux servers back in the 2000s

# 1. On the shell: create a backup copy of the working /etc/ssh/sshd_config
root@server:~# cp -rpf /etc/ssh/sshd_config /etc/ssh/sshd_config_$(date +%d-%m-%y)

# 2. Add a cron job that every 15 minutes overwrites sshd_config with the working version and restarts sshd
#    (inside a crontab every % has to be escaped as \%, otherwise cron treats it as a newline)
root@server:~# crontab -u root -e
*/15 * * * * /bin/cp -rpf /etc/ssh/sshd_config_$(date +\%d-\%m-\%y) /etc/ssh/sshd_config && /bin/systemctl restart sshd

# 3. To revert by hand at any time, run the same copy + restart on the shell
root@server:~# cp -rpf /etc/ssh/sshd_config_$(date +%d-%m-%y) /etc/ssh/sshd_config && /bin/systemctl restart sshd


Add the above cron definition and leave it on for some time. Keep in mind that the date-based backup name only matches on the day the backup was made. Do the /etc/ssh/sshd_config modifications and once you're done, reload sshd with, let's say

root@server:~#  killall -HUP sshd 


If the ssh connectivity continues to work, edit the crontab again, delete the added line and save, otherwise the next cron run will overwrite your new configuration.
If you're not feeling comfortable with vim as a text editor (in case you're a complete newbie and don't know how to get out of vim), before doing any of these steps you can run export EDITOR=nano or export EDITOR=mcedit on the shell; this changes the default text editor used by crontab -e.
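
A complementary safeguard is to let sshd check the file before it is reloaded: sshd -t parses the configuration and exits non-zero on errors. The sketch below restores the backup when validation fails. revert_if_invalid is a hypothetical helper, and the optional validator argument exists only so the logic can be exercised without touching a real sshd.

```shell
# revert_if_invalid LIVE BACKUP [VALIDATOR COMMAND...]
# Default validator: sshd -t -f LIVE (test mode, no daemon is started)
revert_if_invalid() {
    live=$1
    backup=$2
    shift 2
    [ "$#" -gt 0 ] || set -- sshd -t -f "$live"
    if "$@"; then
        echo "config ok: $live"
        return 0
    fi
    # Validation failed: put the known-good copy back
    cp -p "$backup" "$live" && echo "reverted $live from $backup"
    return 1
}
```

A possible use, with the backup path being whatever name was chosen when the copy was made: revert_if_invalid /etc/ssh/sshd_config /path/to/backup && systemctl reload sshd. This does not replace the cron safety net, since a syntactically valid config can still lock you out.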

Hope this helps someone… Enjoy 🙂

Set all logs to log to to physical console /dev/tty12 (tty12) on Linux

Wednesday, August 12th, 2020

tty linux-logo how to log everything to last console terminal tty12

Those who have administered servers since the birth of Linux, and who have actively used GNU / Linux or any other UNIX over the years, know how practical it can be to configure logging of all running services / kernel messages / errors and warnings to a physical console.

Traditionally, from the days I was learning Linux basics, I was shown how to do this on an old Debian Sarge 3.0 Linux without systemd, and I did the same on all the Linux distributions, Redhat 9.0 / Calderas and Mandrakes, I've used either as home systems or for servers. I've always configured output of all messages to go to the last easy to access console /dev/tty12 (for those who never used it, console switching in Linux plain text console mode is done with the key combination CTRL + ALT + F1 .. F12).

In recent times, however, with the introduction of systemd, things changed quite a bit. Consoles used to be handled by /etc/inittab, which added and respawned the physical consoles tty1, tty2 … tty7 (by default Linux usually had 7 of them), and for more consoles I had to manually include additional respawn lines in /etc/inittab.
Nowadays, as of year 2020, /etc/inittab is no longer present on Linux distros, being obsoleted, and the consoles and the printing of INPUT / OUTPUT messages to them are handled by systemd.
 

1. Enable Physical TTYs from TTY8 till TTY12 etc.


The number of default consoles in most Linux distributions I've seen is still tty1 to tty7. Hence, to add more tty consoles and be able to switch not only up to tty7 but up to tty12 once you're connected to the server via a remote ILO (Integrated Lights Out) / IdRAC (Dell Remote Access Controller) / IPMI / IMM (Integrated Management Module), you have to tell systemd by issuing the below systemctl commands:
 

 

# systemctl enable getty@tty8.service

Created symlink /etc/systemd/system/getty.target.wants/getty@tty8.service -> /lib/systemd/system/getty@.service.

systemctl enable getty@tty9.service

Created symlink /etc/systemd/system/getty.target.wants/getty@tty9.service -> /lib/systemd/system/getty@.service.

systemctl enable getty@tty10.service

Created symlink /etc/systemd/system/getty.target.wants/getty@tty10.service -> /lib/systemd/system/getty@.service.

systemctl enable getty@tty11.service

Created symlink /etc/systemd/system/getty.target.wants/getty@tty11.service -> /lib/systemd/system/getty@.service.

systemctl enable getty@tty12.service

Created symlink /etc/systemd/system/getty.target.wants/getty@tty12.service -> /lib/systemd/system/getty@.service.
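
The five commands differ only in the console number, so the unit names can be generated in a loop. A small sketch follows. getty_units is an illustrative helper name. It only prints unit names and leaves the actual systemctl call to the caller.

```shell
# getty_units FIRST LAST -> one getty@ttyN.service name per line
getty_units() {
    n=$1
    while [ "$n" -le "$2" ]; do
        echo "getty@tty${n}.service"
        n=$((n + 1))
    done
}

getty_units 8 12
```

As root the list can then be handed to systemctl in one go, for example systemctl enable $(getty_units 8 12).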


Once the TTYs tty8 to tty12 are enabled, you will be able to switch to these consoles either on a physical LCD / CRT monitor or KVM switch connected to the machine mounted on the Rack shelf when you're in the Data Center, or remotely via the Management IP Interface (ILO) remote console.
 

2. Taking screenshot of the physical console TTY with fbcat


For example below is a screenshot of the 10th enabled tty10:

tty10-linux-screenshot-fbcat-how-to-screenshot-console

As you can see in the screenshot, I've used the nice tool fbcat, which can make a screenshot of the console. This is very useful especially if remote access via an SSH client such as PuTTY / MobaXterm is not available and you only have a physically attached monitor, as in DCs behind a heavy firewall that prevents anyone from reaching the system remotely. For example, you can screenshot the physical console when a major hardware failure occurs and you need to dump the hardware error message to a flash drive, to be handed later to technicians who will analyze it and exchange the broken server hardware part.

Taking screenshots of the CLI with fbcat is possible across most Linux distributions.

In Debian you have to first install the tool via :
 

# apt install --yes fbcat


and on RedHats / CentOS / Fedoras

# yum install -y fbcat


Once the tool is on the server, taking a screenshot of whatever is printed on the console is as easy as

# fbcat > tty_name.ppm


Note that you might want to convert the created .ppm picture to png with any converter, such as imagemagick's convert command or, if you have a GUI, perhaps with the GNU Image Manipulation Program (GIMP).

3. Enabling every rsyslog handled message to log to Physical TTY12


To make everything, such as errors, notices, debug and warning messages, get logged instantly to the above added new /dev/tty12:

Open /etc/rsyslog.conf and append the below lines to the end of the file :
 

daemon,mail.*;\
   news.=crit;news.=err;news.=notice;\
   *.=debug;*.=info;\
   *.=notice;*.=warn   /dev/tty12


To make rsyslog load its new config, check the service status and then restart it:

 

# systemctl status rsyslog

 

 

 

rsyslog.service – System Logging Service
   Loaded: loaded (/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2020-08-10 04:09:36 EEST; 2 days ago
     Docs: man:rsyslogd(8)
           https://www.rsyslog.com/doc/
 Main PID: 671 (rsyslogd)
    Tasks: 4 (limit: 4915)
   Memory: 12.5M
   CGroup: /system.slice/rsyslog.service
           └─671 /usr/sbin/rsyslogd -n -iNONE

 

авг 12 00:00:05 pcfreak rsyslogd[671]:  [origin software="rsyslogd" swVersion="8.1901.0" x-pid="671" x-info="https://www.rsyslo
Warning: Journal has been rotated since unit was started. Log output is incomplete or unavailable.

 

systemctl restart rsyslog


That's all folks. Navigate by pressing simultaneously CTRL + ALT + F12 to get to TTY12, or use ALT + LEFT / ALT + RIGHT ARROW (console switch commands) till you get to the console where everything should now be logged.

Enjoy, and if you like this article, share it with your sysadmin friends to tell them about this nice hack ! 🙂

 

 

 

ipmitool: Reset and manage IPMI (Intelligent Platform Management Interface) / ILO (Integrated Lights Out) remote board on Linux servers

Friday, December 20th, 2019

ipmitool-how-to-get-information-about-hardware-and-reset-ipmi-bmc-linux-access-ipmi-ilo-interface-logo

As a system administrator, no matter whether you manage a bunch of servers in a home-brewed, self-run Data Center location with some Rack mounted Hardware like PowerEdge M600 / ProLiant DL360e G8 / ProLiant DL360 Gen9 (755258-B21), or you're managing a bunch of Dedicated Servers, at some point you are or will be faced with using the IPMI / ILO remote console management board embedded in many Rack mountable servers. If the IPMI / ILO terms are new to you, I suggest you quickly read my earlier article What is IPMI / IPKVM / ILO /  DRAC Remote Management interfaces to server .

hp-proliant-bl460c-ILO-Interface-screenshot

HP Proliant BL460 C IPMI (ILO) Web management interface 

In short, a Remote Management Interface gives you access to the server just as if you had a Monitor and a Keyboard plugged directly into the server.
When a remote computer is down, the sysadmin can access it through IPMI and use a text console to get to the boot screen.
The IPMI protocol specification is led by Intel and was first published on September 16, 1998. It is currently supported by more than 200 computer system vendors, such as Cisco, Dell, Hewlett Packard Enterprise, Intel, NEC Corporation, SuperMicro and Tyan, and is a standard for remote board management for servers.

IPMI-Block-Diagram-how-ipmi-works-and-its-relation-to-BMC
As you can see from the diagram, the Baseboard Management Controller (BMC) is like the heart of IPMI.

ILO / IPMI access is usually provided via a Web Interface with a Java console. Many of the machines also have an IP address on which a normal SSH command prompt is available, giving you the ability to execute diagnostic commands on the ILO about the status of the attached hardware components of the server and to read the attached system sensors, in order to do things such as:

  • The System Overall heat
  • CPU heat temperature
  • System fan rotation speed cycles
  • Extract information about the server chassis
  • Query info about various system peripherals
  • Configure BIOS or UEFI on a remote system with no monitor / keyboard attached

Having IPMI (Intelligent Platform Management Interface) firmware embedded into the server Motherboard is essential for system administration, because besides these goodies it allows you to remotely install an Operating System on a server without any pre-installed OS right after it is bought and mounted in the planned Data Center Rack nest, just as if you had a Monitor / Keyboard and Mouse plugged in and were physically at the remote location.

IPMI is also mega useful for system administration in case of Linux / Windows system updates in which essential System Libraries or binaries are updated and a System reboot is required, because often after Large bundle updates or Release updates the system fails to boot and you need a way to run diagnostics from a System rescue Operating System living on a plugged in USB stick or CD Drive.
As said before, the IPMI remote board is usually accessed via a Remote HTTPS encrypted web interface or via a Secure Shell encrypted session. Sometimes, however, the Web server behind the IPMI Web Interface hangs, especially when multiple sysadmins try to access it, and at times even console SSH access might not be there. Thankfully, those who run a GNU / Linux Operating system on the Hardware node can use the ipmitool tool http://ipmitool.sourceforge.net/ , which is capable of doing a number of useful things with the IPMI management board, including:

  • a Cold Reset of the board so it returns to a working state
  • adding users
  • reading the System hardware and components health status
  • changing the Listener address of the IPMI access Interface
  • updating the IPMI firmware version

Before IPMI can be accessed remotely it has to be enabled, and usually connected via a UTP cable to the Network from which you expect it to be accessible. The location of the IPMI port differs between server vendors.

ibm-power9-server-ipmi

IBM Power 9 Server IPMI port

HP-ILO-Bladeserver-Management-port-MGMT-yellow-cabled

HP IPMI console called ILO (Integrated Lights-Out) Port cabled with yellow cable (usually labelled as
Management Port MGMT)

Supermicro-SSG-5029P-E1CTR12L-Rear-Annotated-dedicated-IPMI-lan-port

Supermicro server IPMI Dedicated Lan Port

 

 In this article I'll shortly explain how IPMITool can be installed and used across GNU / Linux Debian / Ubuntu and other deb based Linuxes with apt, or on Fedora / CentOS (RPM based) with yum etc.

 

1. Install IPMITool

 

– On Debian

 

# apt-get install --yes ipmitool

 

– On CentOS

 

# yum install ipmitool OpenIPMI-tools

 

# ipmitool -V
ipmitool version 1.8.14

 

On CentOS the ipmi service (which loads the IPMI kernel drivers that ipmitool talks to) can be enabled at boot and started with:

 

[root@linux ~]# chkconfig ipmi on 

 

[root@linux ~]# service ipmi start

 

Before starting to use it, it is worth giving here the short description from the ipmitool man page
 

DESCRIPTION
       This program lets you manage Intelligent Platform Management Interface (IPMI) functions of either the local system, via a kernel device driver, or a remote system, using IPMI v1.5 and IPMI v2.0.
       These functions include printing FRU information, LAN configuration, sensor readings, and remote chassis power control.

       IPMI management of a local system interface requires a compatible IPMI kernel driver to be installed and configured.  On Linux this driver is called OpenIPMI and it is included in standard distributions.  On Solaris this driver is called BMC and is included in Solaris 10.  Management of a remote station requires the IPMI-over-LAN interface to be enabled and configured.  Depending on the particular requirements of each system it may be possible to enable the LAN interface using ipmitool over the system interface.

 

2. Get ADMIN IP configured for access

https://3.bp.blogspot.com/-jojgWqj7acg/Wo6bSP0Av1I/AAAAAAAAGdI/xaHewnmAujkprCiDXoBxV7uHonPFjtZDwCLcBGAs/s1600/22-02-2018%2B15-31-09

To find out the current listener IP of the IPMI board when you have no access to the above Web frontend (if it is cabled to the Access / Admin LAN port), print the LAN configuration of channel 1:

 

# ipmitool lan print 1
Set in Progress         : Set Complete
Auth Type Support       : NONE MD2 MD5 PASSWORD
Auth Type Enable        : Callback : MD2 MD5 PASSWORD
                        : User     : MD2 MD5 PASSWORD
                        : Operator : MD2 MD5 PASSWORD
                        : Admin    : MD2 MD5 PASSWORD
                        : OEM      :
IP Address Source       : Static Address
IP Address              : 10.253.41.127
Subnet Mask             : 255.255.254.0
MAC Address             : 0c:c4:7a:4b:1f:70
SNMP Community String   : public
IP Header               : TTL=0x00 Flags=0x00 Precedence=0x00 TOS=0x00
BMC ARP Control         : ARP Responses Enabled, Gratuitous ARP Disabled
Default Gateway IP      : 10.253.41.254
Default Gateway MAC     : 00:00:0c:07:ac:7b
Backup Gateway IP       : 10.253.41.254
Backup Gateway MAC      : 00:00:00:00:00:00
802.1q VLAN ID          : 8
802.1q VLAN Priority    : 0
RMCP+ Cipher Suites     : 1,2,3,6,7,8,11,12
Cipher Suite Priv Max   : aaaaXXaaaXXaaXX
                        :     X=Cipher Suite Unused
                        :     c=CALLBACK
                        :     u=USER
                        :     o=OPERATOR
                        :     a=ADMIN
                        :     O=OEM
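
Scripts often need just one value from this listing. Below is a minimal sketch, assuming the "Label : value" layout shown above stays the same across ipmitool versions. lan_field is a made-up helper name, not an ipmitool feature.

```shell
# lan_field LABEL: reads `ipmitool lan print` output on stdin and prints
# the value of the line whose label matches LABEL exactly
lan_field() {
    awk -v key="$1" '
        {
            pos = index($0, ":")          # split at the FIRST colon only,
            if (pos == 0) next            # so MAC addresses stay intact
            label = substr($0, 1, pos - 1)
            value = substr($0, pos + 1)
            gsub(/^[ \t]+|[ \t]+$/, "", label)
            gsub(/^[ \t]+|[ \t]+$/, "", value)
            if (label == key) { print value; exit }
        }'
}

# Intended use on a real machine (as root):
#   ipmitool lan print 1 | lan_field "IP Address"
```

Matching the whole label matters here, because "IP Address Source" starts with the same words as "IP Address".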

 

 

3. Configure custom access IP and gateway for IPMI

 

[root@linux ~]# ipmitool lan set 1 ipsrc static

 

[root@linux ~]# ipmitool lan set 1 ipaddr 192.168.1.211
Setting LAN IP Address to 192.168.1.211

 

[root@linux ~]# ipmitool lan set 1 netmask 255.255.255.0
Setting LAN Subnet Mask to 255.255.255.0

 

[root@linux ~]# ipmitool lan set 1 defgw ipaddr 192.168.1.254
Setting LAN Default Gateway IP to 192.168.1.254

 

[root@linux ~]# ipmitool lan set 1 defgw macaddr 00:0e:0c:aa:8e:13
Setting LAN Default Gateway MAC to 00:0e:0c:aa:8e:13

 

[root@linux ~]# ipmitool lan set 1 arp respond on
Enabling BMC-generated ARP responses

 

[root@linux ~]# ipmitool lan set 1 auth ADMIN MD5

[root@linux ~]# ipmitool lan set 1 access on

 

4. Getting a list of IPMI existing users

 

# ipmitool user list 1
ID  Name             Callin  Link Auth  IPMI Msg   Channel Priv Limit
2   admin1           false   false      true       ADMINISTRATOR
3   ovh_dontchange   true    false      true       ADMINISTRATOR
4   ro_dontchange    true    true       true       USER
6                    true    true       true       NO ACCESS
7                    true    true       true       NO ACCESS
8                    true    true       true       NO ACCESS
9                    true    true       true       NO ACCESS
10                   true    true       true       NO ACCESS


– To get summary of existing users

# ipmitool user summary
Maximum IDs         : 10
Enabled User Count  : 4
Fixed Name Count    : 2

5. Create new Admin username into IPMI board
 

[root@linux ~]# ipmitool user set name 2 Your-New-Username

 

[root@linux ~]# ipmitool user set password 2
Password for user 2: 
Password for user 2: 

 

[root@linux ~]# ipmitool channel setaccess 1 2 link=on ipmi=on callin=on privilege=4

 

[root@linux ~]# ipmitool user enable 2
[root@linux ~]# 

 

6. Configure non-privileged user into IPMI board

If a user should only be used for querying sensor data, a custom privilege level can be setup for that. This user then has no rights for activating or deactivating the server, for example. A user named monitor will be created for this in the following example:

[root@linux ~]# ipmitool user set name 3 monitor

 

[root@linux ~]# ipmitool user set password 3
Password for user 3: 
Password for user 3: 

 

[root@linux ~]# ipmitool channel setaccess 1 3 link=on ipmi=on callin=on privilege=2

 

[root@linux ~]# ipmitool user enable 3

The meaning of the various privilege numbers is displayed when  ipmitool channel  is called without any additional parameters.

 

 

[root@linux ~]# ipmitool channel
Channel Commands: authcap   <channel number> <max privilege>
                  getaccess <channel number> [user id]
                  setaccess <channel number> <user id> [callin=on|off] [ipmi=on|off] [link=on|off] [privilege=level]
                  info      [channel number]
                  getciphers <ipmi | sol> [channel]

 

Possible privilege levels are:
   1   Callback level
   2   User level
   3   Operator level
   4   Administrator level
   5   OEM Proprietary level
  15   No access
[root@linux ~]# 

The user just created (named 'monitor') has been assigned the USER privilege level. So that LAN access is allowed for this user, you must activate MD5 authentication for LAN access for this user group (USER privilege level).

[root@linux ~]# ipmitool channel getaccess 1 3
Maximum User IDs     : 15
Enabled User IDs     : 2

User ID              : 3
User Name            : monitor
Fixed Name           : No
Access Available     : call-in / callback
Link Authentication  : enabled
IPMI Messaging       : enabled
Privilege Level      : USER

[root@linux ~]# 
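
Sections 5 and 6 run the same four commands with a different user ID, name and privilege level. The sketch below prints that sequence for review instead of executing it. ipmi_user_cmds is a made-up helper, and channel 1 is taken from the examples above.

```shell
# ipmi_user_cmds USER_ID NAME PRIVILEGE   (2 = User level, 4 = Administrator level)
ipmi_user_cmds() {
    echo "ipmitool user set name $1 $2"
    echo "ipmitool user set password $1"
    echo "ipmitool channel setaccess 1 $1 link=on ipmi=on callin=on privilege=$3"
    echo "ipmitool user enable $1"
}

ipmi_user_cmds 3 monitor 2
```

For the MD5 step mentioned above, the lan set ... auth syntax from section 3 presumably takes USER in place of ADMIN. That is an assumption, so check the ipmitool man page on your version before relying on it.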

 

7. Check server firmware version on a server via IPMI

 

# ipmitool mc info
Device ID                 : 32
Device Revision           : 1
Firmware Revision         : 3.31
IPMI Version              : 2.0
Manufacturer ID           : 10876
Manufacturer Name         : Supermicro
Product ID                : 1579 (0x062b)
Product Name              : Unknown (0x62B)
Device Available          : yes
Provides Device SDRs      : no
Additional Device Support :
    Sensor Device
    SDR Repository Device
    SEL Device
    FRU Inventory Device
    IPMB Event Receiver
    IPMB Event Generator
    Chassis Device


ipmitool mc info and ipmitool bmc info are the same command; bmc is just an alias for mc.

8. Reset IPMI management controller or BMC if hung

 

As said earlier, if for some reason Web GUI access or SSH to the IPMI is lost, reset it with:

root@linux:/root#  ipmitool mc reset [ warm | cold ]

 

If you want to cut power to the IPMI for a second and bring it back on, use the cold reset (this usually should be done if the warm reset does not work).

 

root@linux:/root# ipmitool mc reset cold

 

Otherwise a soft / warm reset is done with:

 

ipmitool mc reset warm

 

Sometimes the BMC component of IPMI hangs and the only fix to restore access to the server remote board is to reset the BMC as well:

 

root@linux:/root# ipmitool bmc reset cold
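
The "warm first, cold only if needed" approach can be sketched as a shell function. Everything named here (bmc_reset_with_fallback, the IPMITOOL and BMC_WAIT variables, the 60 second default wait) is my own assumption for illustration; how long a BMC needs to come back differs from board to board.

```shell
#!/bin/sh
# Sketch: try a warm BMC reset first and fall back to a cold reset
# if the controller still does not answer "mc info" afterwards.
# bmc_reset_with_fallback, IPMITOOL and BMC_WAIT are illustrative names.
bmc_reset_with_fallback() {
    tool=${IPMITOOL:-ipmitool}
    wait_s=${BMC_WAIT:-60}          # assumed settle time, tune per board
    $tool mc reset warm
    sleep "$wait_s"
    if $tool mc info >/dev/null 2>&1; then
        echo "BMC answers after warm reset"
    else
        echo "BMC still unresponsive, trying cold reset"
        $tool mc reset cold
    fi
}
```

The function only decides between the two reset types shown above; it does not check whether the cold reset itself succeeded.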

 

9. Print hardware system event log

 

root@linux:/root# ipmitool sel info
SEL Information
Version          : 1.5 (v1.5, v2 compliant)
Entries          : 0
Free Space       : 10240 bytes
Percent Used     : 0%
Last Add Time    : Not Available
Last Del Time    : 07/02/2015 17:22:34
Overflow         : false
Supported Cmds   : 'Reserve' 'Get Alloc Info'
# of Alloc Units : 512
Alloc Unit Size  : 20
# Free Units     : 512
Largest Free Blk : 512
Max Record Size  : 20

 

 ipmitool sel list
SEL has no entries

In this particular case the system shows no entries as it was run on a tiny MikroTik 1U machine, however on most Dell PowerEdge / HP ProLiant / Lenovo System X machines this will usually return plenty of messages.

To print the entries in extended form:

ipmitool sel elist

To clear the log if anything is logged:

ipmitool sel clear
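
For monitoring it can be handy to pull just the fill level out of the sel info output shown above. A minimal sketch, assuming the "Percent Used" line looks like in my output (some BMCs may print something else there); the function name sel_percent_used is made up for this example.

```shell
#!/bin/sh
# Sketch: print the numeric "Percent Used" value from "ipmitool sel info" output.
# Reads the sel info text on stdin; sel_percent_used is an illustrative name.
sel_percent_used() {
    awk -F: '/^Percent Used/ { gsub(/[ %]/, "", $2); print $2 }'
}
```

It would be used as ipmitool sel info | sel_percent_used, and the printed number can then be compared against a threshold of your choice before deciding to archive and clear the log.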

 

10.  Print Field Replaceable Units ( FRUs ) on the server 

 

[root@linux ~]# ipmitool fru print
 

 

FRU Device Description : Builtin FRU Device (ID 0)
 Chassis Type          : Other
 Chassis Serial        : KD5V59B
 Chassis Extra         : c3903ebb6237363698cdbae3e991bbed
 Board Mfg Date        : Mon Sep 24 02:00:00 2012
 Board Mfg             : IBM
 Board Product         : System Board
 Board Serial          : XXXXXXXXXXX
 Board Part Number     : 00J6528
 Board Extra           : 00W2671
 Board Extra           : 1400
 Board Extra           : 0000
 Board Extra           : 5000
 Board Extra           : 10

 Product Manufacturer  : IBM
 Product Name          : System x3650 M4
 Product Part Number   : 1955B2G
 Product Serial        : KD7V59K
 Product Asset Tag     :

FRU Device Description : Power Supply 1 (ID 1)
 Board Mfg Date        : Mon Jan  1 01:00:00 1996
 Board Mfg             : ACBE
 Board Product         : IBM Designed Device
 Board Serial          : YK151127R1RN
 Board Part Number     : ZZZZZZZ
 Board Extra           : ZZZZZZ<FF><FF><FF><FF><FF>
 Board Extra           : 0200
 Board Extra           : 00
 Board Extra           : 0080
 Board Extra           : 1

FRU Device Description : Power Supply 2 (ID 2)
 Board Mfg Date        : Mon Jan  1 01:00:00 1996
 Board Mfg             : ACBE
 Board Product         : IBM Designed Device
 Board Serial          : YK131127M1LE
 Board Part Number     : ZZZZZ
 Board Extra           : ZZZZZ<FF><FF><FF><FF><FF>
 Board Extra           : 0200
 Board Extra           : 00
 Board Extra           : 0080
 Board Extra           : 1

FRU Device Description : DASD Backplane 1 (ID 3)
….

 

Worth mentioning here is that some cheaper server vendors such as Trendmicro might show no data here (no idea whether this is a protocol incompatibility or an ipmitool issue).

 

11. Get output about system sensors Temperature / Fan / Power Supply

 

Most newer servers have sensors to track temperature, voltage and fan speed (CPU temperature, peripheral temperature, overall system temperature etc.).
To get a full list of sensor readings from IPMI:
 

# ipmitool sensor
CPU Temp         | 29.000     | degrees C  | ok    | 0.000     | 0.000     | 0.000     | 95.000    | 98.000    | 100.000
System Temp      | 40.000     | degrees C  | ok    | -9.000    | -7.000    | -5.000    | 80.000    | 85.000    | 90.000
Peripheral Temp  | 41.000     | degrees C  | ok    | -9.000    | -7.000    | -5.000    | 80.000    | 85.000    | 90.000
PCH Temp         | 56.000     | degrees C  | ok    | -11.000   | -8.000    | -5.000    | 90.000    | 95.000    | 100.000
FAN 1            | na         |            | na    | na        | na        | na        | na        | na        | na
FAN 2            | na         |            | na    | na        | na        | na        | na        | na        | na
FAN 3            | na         |            | na    | na        | na        | na        | na        | na        | na
FAN 4            | na         |            | na    | na        | na        | na        | na        | na        | na
FAN A            | na         |            | na    | na        | na        | na        | na        | na        | na
Vcore            | 0.824      | Volts      | ok    | 0.480     | 0.512     | 0.544     | 1.488     | 1.520     | 1.552
3.3VCC           | 3.296      | Volts      | ok    | 2.816     | 2.880     | 2.944     | 3.584     | 3.648     | 3.712
12V              | 12.137     | Volts      | ok    | 10.494    | 10.600    | 10.706    | 13.091    | 13.197    | 13.303
VDIMM            | 1.496      | Volts      | ok    | 1.152     | 1.216     | 1.280     | 1.760     | 1.776     | 1.792
5VCC             | 4.992      | Volts      | ok    | 4.096     | 4.320     | 4.576     | 5.344     | 5.600     | 5.632
CPU VTT          | 1.008      | Volts      | ok    | 0.872     | 0.896     | 0.920     | 1.344     | 1.368     | 1.392
VBAT             | 3.200      | Volts      | ok    | 2.816     | 2.880     | 2.944     | 3.584     | 3.648     | 3.712
VSB              | 3.328      | Volts      | ok    | 2.816     | 2.880     | 2.944     | 3.584     | 3.648     | 3.712
AVCC             | 3.312      | Volts      | ok    | 2.816     | 2.880     | 2.944     | 3.584     | 3.648     | 3.712
Chassis Intru    | 0x1        | discrete   | 0x0100| na        | na        | na        | na        | na        | na
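
The table above is easy to filter with awk. Below is a rough sketch that prints only the sensors whose status column is something other than ok or na. The function name ipmi_sensor_alerts is invented for this example, and the assumption that problem states show up as values like nc / cr / nr in the fourth column is mine; discrete sensors (such as Chassis Intru) are skipped because their status is a hex code.

```shell
#!/bin/sh
# Sketch: list threshold sensors from "ipmitool sensor" output (stdin)
# whose status is neither "ok" nor "na". ipmi_sensor_alerts is an illustrative name.
ipmi_sensor_alerts() {
    awk -F'|' '
        {
            # trim the padding around every column
            for (i = 1; i <= NF; i++) gsub(/^[ \t]+|[ \t]+$/, "", $i)
            name = $1; value = $2; unit = $3; status = $4
            if (unit == "discrete") next              # hex status, not handled here
            if (status == "ok" || status == "na") next
            printf "%s: %s %s (status %s)\n", name, value, unit, status
        }'
}
```

Used as ipmitool sensor | ipmi_sensor_alerts, it prints nothing for the healthy output above.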

 

To get only a summary of the sensor data from the SDR (Sensor Data Repository) entries and readings:

 

[root@linux ~]# ipmitool sdr list 

Planar 3.3V      | 3.31 Volts        | ok
Planar 5V        | 5.06 Volts        | ok
Planar 12V       | 12.26 Volts       | ok
Planar VBAT      | 3.14 Volts        | ok
Avg Power        | 80 Watts          | ok
PCH Temp         | 45 degrees C      | ok
Ambient Temp     | 19 degrees C      | ok
PCI Riser 1 Temp | 25 degrees C      | ok
PCI Riser 2 Temp | no reading        | ns
Mezz Card Temp   | no reading        | ns
Fan 1A Tach      | 3071 RPM          | ok
Fan 1B Tach      | 2592 RPM          | ok
Fan 2A Tach      | 3145 RPM          | ok
Fan 2B Tach      | 2624 RPM          | ok
Fan 3A Tach      | 3108 RPM          | ok
Fan 3B Tach      | 2592 RPM          | ok
Fan 4A Tach      | no reading        | ns
Fan 4B Tach      | no reading        | ns
CPU1 VR Temp     | 27 degrees C      | ok
CPU2 VR Temp     | 27 degrees C      | ok
DIMM AB VR Temp  | 24 degrees C      | ok
DIMM CD VR Temp  | 23 degrees C      | ok
DIMM EF VR Temp  | 25 degrees C      | ok
DIMM GH VR Temp  | 24 degrees C      | ok
Host Power       | 0x00              | ok
IPMI Watchdog    | 0x00              | ok

 

[root@linux ~]# ipmitool sdr type Temperature
PCH Temp         | 31h | ok  | 45.1 | 45 degrees C
Ambient Temp     | 32h | ok  | 12.1 | 19 degrees C
PCI Riser 1 Temp | 3Ah | ok  | 16.1 | 25 degrees C
PCI Riser 2 Temp | 3Bh | ns  | 16.2 | No Reading
Mezz Card Temp   | 3Ch | ns  | 44.1 | No Reading
CPU1 VR Temp     | F7h | ok  | 20.1 | 27 degrees C
CPU2 VR Temp     | F8h | ok  | 20.2 | 27 degrees C
DIMM AB VR Temp  | F9h | ok  | 20.3 | 25 degrees C
DIMM CD VR Temp  | FAh | ok  | 20.4 | 23 degrees C
DIMM EF VR Temp  | FBh | ok  | 20.5 | 26 degrees C
DIMM GH VR Temp  | FCh | ok  | 20.6 | 24 degrees C
Ambient Status   | 8Eh | ok  | 12.1 |
CPU 1 OverTemp   | A0h | ok  |  3.1 | Transition to OK
CPU 2 OverTemp   | A1h | ok  |  3.2 | Transition to OK
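
The per-type listing is convenient for quick checks, for example to find the hottest sensor. A small sketch, assuming the reading sits in the fifth column as "NN degrees C" like in the output above; the function name hottest_sensor is my own.

```shell
#!/bin/sh
# Sketch: print the sensor with the highest temperature from
# "ipmitool sdr type Temperature" output read on stdin.
# hottest_sensor is an illustrative name.
hottest_sensor() {
    awk -F'|' '
        $5 ~ /degrees C/ {
            split($5, parts, " ")          # parts[1] is the number
            temp = parts[1] + 0
            if (!seen || temp > max) { max = temp; name = $1; seen = 1 }
        }
        END {
            if (seen) { sub(/[ \t]+$/, "", name); print name ": " max }
        }'
}
```

Lines without a temperature reading (No Reading, Transition to OK, empty) are simply ignored.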

 

[root@linux ~]# ipmitool sdr type Fan
Fan 1A Tach      | 40h | ok  | 29.1 | 3034 RPM
Fan 1B Tach      | 41h | ok  | 29.1 | 2592 RPM
Fan 2A Tach      | 42h | ok  | 29.2 | 3145 RPM
Fan 2B Tach      | 43h | ok  | 29.2 | 2624 RPM
Fan 3A Tach      | 44h | ok  | 29.3 | 3108 RPM
Fan 3B Tach      | 45h | ok  | 29.3 | 2592 RPM
Fan 4A Tach      | 46h | ns  | 29.4 | No Reading
Fan 4B Tach      | 47h | ns  | 29.4 | No Reading
PS 1 Fan Fault   | 73h | ok  | 10.1 | Transition to OK
PS 2 Fan Fault   | 74h | ok  | 10.2 | Transition to OK

 

[root@linux ~]# ipmitool sdr type ‘Power Supply’
Sensor Type "‘Power" not found.
Sensor Types:
        Temperature               (0x01)   Voltage                   (0x02)
        Current                   (0x03)   Fan                       (0x04)
        Physical Security         (0x05)   Platform Security         (0x06)
        Processor                 (0x07)   Power Supply              (0x08)
        Power Unit                (0x09)   Cooling Device            (0x0a)
        Other                     (0x0b)   Memory                    (0x0c)
        Drive Slot / Bay          (0x0d)   POST Memory Resize        (0x0e)
        System Firmwares          (0x0f)   Event Logging Disabled    (0x10)
        Watchdog1                 (0x11)   System Event              (0x12)
        Critical Interrupt        (0x13)   Button                    (0x14)
        Module / Board            (0x15)   Microcontroller           (0x16)
        Add-in Card               (0x17)   Chassis                   (0x18)
        Chip Set                  (0x19)   Other FRU                 (0x1a)
        Cable / Interconnect      (0x1b)   Terminator                (0x1c)
        System Boot Initiated     (0x1d)   Boot Error                (0x1e)
        OS Boot                   (0x1f)   OS Critical Stop          (0x20)
        Slot / Connector          (0x21)   System ACPI Power State   (0x22)
        Watchdog2                 (0x23)   Platform Alert            (0x24)
        Entity Presence           (0x25)   Monitor ASIC              (0x26)
        LAN                       (0x27)   Management Subsys Health  (0x28)
        Battery                   (0x29)   Session Audit             (0x2a)
        Version Change            (0x2b)   FRU State                 (0x2c)

The command above failed only because the sensor type was wrapped in typographic quotes (‘ ’) picked up by copy-paste, so ipmitool printed the list of valid sensor types instead. With plain ASCII quotes, ipmitool sdr type 'Power Supply' lists the power supply sensors.

 

12. Using System Chassis to initiate power on / off / reset / soft shutdown

 

!!!!!  Beware, only run this if you know what you're really doing; don't just paste it into a production system. If you do so, it is your responsibility !!!!! 

–  do a soft-shutdown via acpi 

 

ipmitool [chassis] power soft

 

– issue a hard power off, wait 1s, power on 

 

ipmitool [chassis] power cycle

 

– run a hard power off

 

ipmitool [chassis] power off

 
– do a hard power on 

 

ipmitool [chassis] power on

 

–  issue a hard reset

 

ipmitool [chassis] power reset


– Get system power status
 

ipmitool chassis power status
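
To avoid firing power commands blindly, the status can be checked first. The sketch below powers a machine on only when it reports being off. It assumes the status command answers with a line ending in "is on" or "is off" (that is what I get as "Chassis Power is on"); ensure_power_on and the IPMITOOL variable are names I made up.

```shell
#!/bin/sh
# Sketch: power the chassis on only if it currently reports "off".
# ensure_power_on and IPMITOOL are illustrative names.
ensure_power_on() {
    tool=${IPMITOOL:-ipmitool}
    status=$($tool chassis power status)
    case $status in
        *"is on")  echo "already on, nothing to do" ;;
        *"is off") $tool chassis power on ;;
        *)         echo "unexpected status: $status" >&2; return 1 ;;
    esac
}
```

Any answer the function does not recognise is treated as an error rather than guessed at, which seems the safer default for power operations.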

 

13. Use IPMI (SoL) Serial over Lan to execute commands remotely


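Besides using ipmitool locally on a server that has its IPMI / ILO / DRAC console disabled, it can also be used to query a server and make it do stuff remotely.

Note that lanplus is not a kernel module but an ipmitool interface (IPMI v2.0 over LAN) selected with the -I lanplus option, so nothing has to be loaded with modprobe for remote access; kernel modules such as ipmi_devintf and ipmi_si are needed only for local access.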

 

ipmitool -I lanplus -H 192.168.99.1 -U user -P pass chassis power status

ipmitool -I lanplus -H 192.168.98.1 -U user -P pass chassis power status

ipmitool -I lanplus -H 192.168.98.1 -U user -P pass chassis power reset

– Activating a SoL session

ipmitool -I lanplus -H 192.168.98.1 -U user -P pass sol activate

– Deactivating Sol server capabilities
 

 ipmitool -I lanplus -H 192.168.99.1 -U user -P pass sol deactivate
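
Typing -P pass on the command line leaves the password in the shell history and the process list. ipmitool can instead read it from the IPMI_PASSWORD environment variable when called with -E. A minimal wrapper sketch; ipmi_remote, IPMITOOL and IPMI_USER are names I invented for this example, and "user" is just the placeholder login from the commands above.

```shell
#!/bin/sh
# Sketch: run a remote ipmitool command over lanplus without -P on the command line.
# ipmi_remote, IPMITOOL and IPMI_USER are illustrative names.
ipmi_remote() {
    host=$1; shift
    tool=${IPMITOOL:-ipmitool}
    # -E tells ipmitool to take the password from $IPMI_PASSWORD
    $tool -I lanplus -H "$host" -U "${IPMI_USER:-user}" -E "$@"
}
```

With IPMI_PASSWORD exported in the session, a call looks like ipmi_remote 192.168.99.1 chassis power status.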

 

14. Modify boot device order on next boot

 

!!!!! Do not run this unless you really want to modify the boot device order; careless copy-pasting could leave your server unbootable on next boot !!!!!

– Set first boot device to be the BIOS setup

ipmitool chassis bootdev bios

 

– Set first boot device to be CD Drive

ipmitool chassis bootdev cdrom 

 

– Set first boot device to be via Network Boot PXE protocol

ipmitool chassis bootdev pxe 

 

15. Using ipmitool shell

 

root@iqtestfb:~# ipmitool shell
ipmitool> help
Commands:
        raw           Send a RAW IPMI request and print response
        i2c           Send an I2C Master Write-Read command and print response
        spd           Print SPD info from remote I2C device
        lan           Configure LAN Channels
        chassis       Get chassis status and set power state
        power         Shortcut to chassis power commands
        event         Send pre-defined events to MC
        mc            Management Controller status and global enables
        sdr           Print Sensor Data Repository entries and readings
        sensor        Print detailed sensor information
        fru           Print built-in FRU and scan SDR for FRU locators
        gendev        Read/Write Device associated with Generic Device locators sdr
        sel           Print System Event Log (SEL)
        pef           Configure Platform Event Filtering (PEF)
        sol           Configure and connect IPMIv2.0 Serial-over-LAN
        tsol          Configure and connect with Tyan IPMIv1.5 Serial-over-LAN
        isol          Configure IPMIv1.5 Serial-over-LAN
        user          Configure Management Controller users
        channel       Configure Management Controller channels
        session       Print session information
        dcmi          Data Center Management Interface
        sunoem        OEM Commands for Sun servers
        kontronoem    OEM Commands for Kontron devices
        picmg         Run a PICMG/ATCA extended cmd
        fwum          Update IPMC using Kontron OEM Firmware Update Manager
        firewall      Configure Firmware Firewall
        delloem       OEM Commands for Dell systems
        shell         Launch interactive IPMI shell
        exec          Run list of commands from file
        set           Set runtime variable for shell and exec
        hpm           Update HPM components using PICMG HPM.1 file
        ekanalyzer    run FRU-Ekeying analyzer using FRU files
        ime           Update Intel Manageability Engine Firmware
ipmitool>

 

16. Changing BMC / DRAC time setting

 

# ipmitool -H XXX.XXX.XXX.XXX -U root -P pass sel time set "01/21/2011 16:20:44"

 

17. Loading script of IPMI commands

# ipmitool exec /path-to-script/script-with-instructions.txt  
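
As far as I know the file given to exec holds one ipmitool command per line, written without the leading "ipmitool", with # starting a comment. Here is a sketch that generates such a file with read-only commands already used in this article; the function name write_ipmi_script and the chosen set of commands are only an example.

```shell
#!/bin/sh
# Sketch: write a small command file for "ipmitool exec".
# write_ipmi_script is an illustrative name; $1 is the file to create.
write_ipmi_script() {
    cat > "$1" <<'EOF'
# one ipmitool command per line, without the leading "ipmitool"
mc info
sel info
sdr type Temperature
chassis power status
EOF
}
```

After write_ipmi_script /tmp/health.txt the file can be passed to ipmitool exec /tmp/health.txt (the path is just a placeholder).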

 

Closure

As you saw ipmitool can be used to do plenty of cool things both locally and remotely on a server that has an IPMI interface available. The tool is mega useful in case the ILO console hangs, as it can be used to reset it.
I explained shortly what the Intelligent Platform Management Interface is and how it can be accessed and used on Linux via ipmitool. I went through some of its basic uses: how to print the configured ILO access IP, how this Admin IP and network configuration can be changed, how to print the existing IPMI users and how to add new Admin and non-privileged users.
Then I've shown how system hardware and firmware information can be displayed, how the IPMI management BMC can be reset in case it hangs, how hardware system event logs can be printed (useful in case of hardware failure errors etc.) and how to print reports on current system fan / power supply and temperature. Finally I explained how the server chassis commands can be used for soft and cold server reboots locally or via SoL (Serial over LAN) and how the boot order of the system can be modified.

ipmitool is a great tool to further automate different sysadmin tasks with shell scripts, for stuff such as tracking servers for failing hardware and auto-rebooting inaccessible failed servers to guarantee a higher level of availability.
Hope you enjoyed the article. It will be interesting to hear of any other known ipmitool scripts or uses; if you know such, please share them.

How to Set MySQL MariaDB server root user to be able to connect from any host on the Internet / Solution to ‘ ERROR 1045 (28000): Access denied for user ‘root’@’localhost’ (using password: YES) ‘

Tuesday, September 3rd, 2019


In this small article, I'll shortly explain how I set up a standard default-package MariaDB database server on Debian 10 Buster Linux and how I configured it to be accessible from any hostname on the Internet, in order to connect from a remote developer PC with MySQL GUI administration tools such as MySQL Workbench / HeidiSQL / Navicat / dbForge, as well as the few setbacks experienced in the process (e.g. what was the reason for the ' ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES) ' error and its solution).

Setting new or changing old MariaDB (MySQL) root server password

 

I've set up a brand new MariaDB database (the new free OpenSource fork of MySQL), mariadb-server-10.3, on Debian 10 right after the OS was installed, with the usual apt command:

# apt install mariadb-server

Next step was to change the root access password, which was empty by default, e.g. I connected with the mysql CLI locally while logged in via SSH on the server and ran:

root@linux:~# mysql -u root -p

MariaDB [(none)]> use mysql;
MariaDB [mysql]> update user set authentication_string=PASSWORD("MyChosenNewPassword") where User='root';
MariaDB [mysql]> flush privileges;

There was a requirement by the customer that the MySQL server is accessed not only locally but from any IP address anywhere on the Internet, so the next step was to do so.

Allowing access to MySQL server from Anywhere

Allowing access from any host to the MariaDB SQL server is a bad security practice, but as the customer is the King I've fulfilled this weird wish too, by changing the listener for MariaDB (MySQL) on Debian 10 (codenamed Buster).

Changing the listener from the default 127.0.0.1 (localhost) to any address is done by modifying the bind-address directive in /etc/mysql/mariadb.conf.d/50-server.cnf:

root@linux:~# vim /etc/mysql/mariadb.conf.d/50-server.cnf

Then comment out

bind-address  = 127.0.0.1

and add instead 0.0.0.0 (any listener)

bind-address  = 0.0.0.0

The result can be checked with grep:

root@linux:/etc/mysql/mariadb.conf.d# grep -i bind-address 50-server.cnf
##bind-address            = 127.0.0.1
bind-address    = 0.0.0.0
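
If the same change has to be repeated on several machines, editing with vim gets tedious. Below is a rough sed-based sketch; the function name set_bind_address is made up, and unlike my manual edit it rewrites the active bind-address line in place instead of commenting it out and adding a new one (a .bak copy of the file is kept).

```shell
#!/bin/sh
# Sketch: point the active bind-address line of a MariaDB config file at a new address.
# set_bind_address is an illustrative name; $1 = config file, $2 = address.
set_bind_address() {
    conf=$1; addr=$2
    # commented-out lines (starting with #) are left untouched; original saved as $conf.bak
    sed -i.bak -E "s/^[[:space:]]*bind-address[[:space:]]*=.*/bind-address = ${addr}/" "$conf"
}
```

For the case described here it would be called as set_bind_address /etc/mysql/mariadb.conf.d/50-server.cnf 0.0.0.0, followed by a restart of the service.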


Then, to make the new change effective, restart MariaDB (luckily the old SysV init script still works even though systemd is in use).
 

root@linux:~# /etc/init.d/mysql restart
[ ok ] Restarting mysql (via systemctl): mysql.service.


To make sure it is properly listening on MySQL's default TCP port 3306, I then as usual used netstat.

root@pritchi:~# netstat -etna |grep -i 3306
tcp        0      0 0.0.0.0:3306            0.0.0.0:*               LISTEN      109        1479917  
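
The same check can be scripted, for example in a post-deploy test. A small sketch that reads netstat -tna style output on stdin and succeeds only if the given port is in LISTEN state on 0.0.0.0; the function name listens_on_all is my own and the column positions are assumed to match the output above.

```shell
#!/bin/sh
# Sketch: succeed if netstat output (stdin) shows a TCP listener on 0.0.0.0:<port>.
# listens_on_all is an illustrative name; $1 = port number.
listens_on_all() {
    port=$1
    awk -v p="$port" '
        $1 ~ /^tcp/ && $4 == "0.0.0.0:" p && $6 == "LISTEN" { found = 1 }
        END { exit !found }'
}
```

Used as netstat -etna | listens_on_all 3306 && echo "MariaDB listens on all addresses".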

 

By the way the exact mariadb.cnf used on this middle-sized front-backend server is here – the server is planned to be an Apache Web server + Database host with a middle-range MySQL DB, able to serve a few thousand simultaneous unique customers.

To make sure no firewall is preventing access to MariaDB, I've checked for any reject rules in iptables and for ipset definitions, e.g.:
 

root@linux:~# iptables -L |grep -i rej

root@linux:~# ipset list

 

Then, to double-check that MySQL is reachable from anywhere, I used simple telnet from my desktop laptop PC (that also runs Debian Linux) towards the server.

hipo@jeremiah:~$ telnet 52.88.235.45 3306
Trying 52.88.235.45…
Connected to 52.88.235.45.
Escape character is '^]'.
[
5.5.5-10.3.15-MariaDB-1
                       rQ}Cs>v\��-��G1[W%O>+Y^OQmysql_native_password
Connection closed by foreign host.

 

As telnet does not speak the MySQL protocol and never answers the server's handshake, the remote server terminates the connection after a few seconds.

 

Setting MySQL user to be able to connect to local server MySQL from any remote hostname


I've connected locally to MariaDB server with mysql -u root -p and issued following set of SQL commands to make MySQL root user be able to connect from anywhere:

 

CREATE USER 'root'@'%' IDENTIFIED BY 'my-secret-pass';
GRANT ALL ON *.* TO 'root'@'localhost';
GRANT ALL ON *.* TO 'root'@'%';

 

The next step I took was to try logging in as the root (admin) MariaDB superuser from the MySQL CLI (Command Line Interface) on my desktop, just to find out I'm facing a nasty error.
 

hipo@jeremiah:~$ mysql -u root -H remote-server-hostname.com -p
Enter password:
ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES)


My first guess was that something is wrong with my root user created in MySQL's mysql.user table (in MySQL this is the privileges table that stores how user credentials are handled by the locally running mysqld process).

 

Changing the MySQL root (admin) password no longer possible on Debian 10 Buster?

 

The standard way to change the MySQL root password, well known over the years, via a simple dpkg-reconfigure (provided by Debian's debconf) no longer works, so the command below produces empty output instead of triggering the good old Ncurses text-based interface …

 

root@linux:~# /usr/sbin/dpkg-reconfigure mariadb-server-10.3

 

 

Viewing MariaDB (MySQL) username / password set-up from the CLI

 

To see how the privileges were set up, I've used the following command:

 

MariaDB [mysql]> select * from mysql.user where User = 'root';
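+-----------+------+-------------------------------------------+-------------+-------------+-------------+-------------+-------------+-----------+-------------+---------------+--------------+-----------+------------+-----------------+------------+------------+--------------+------------+-----------------------+------------------+--------------+-----------------+------------------+------------------+----------------+---------------------+--------------------+------------------+------------+--------------+------------------------+---------------------+----------+------------+-------------+--------------+---------------+-------------+-----------------+----------------------+-----------------------+-----------------------+------------------+---------+--------------+--------------------+
| Host      | User | Password                                  | Select_priv | Insert_priv | Update_priv | Delete_priv | Create_priv | Drop_priv | Reload_priv | Shutdown_priv | Process_priv | File_priv | Grant_priv | References_priv | Index_priv | Alter_priv | Show_db_priv | Super_priv | Create_tmp_table_priv | Lock_tables_priv | Execute_priv | Repl_slave_priv | Repl_client_priv | Create_view_priv | Show_view_priv | Create_routine_priv | Alter_routine_priv | Create_user_priv | Event_priv | Trigger_priv | Create_tablespace_priv | Delete_history_priv | ssl_type | ssl_cipher | x509_issuer | x509_subject | max_questions | max_updates | max_connections | max_user_connections | plugin                | authentication_string | password_expired | is_role | default_role | max_statement_time |
+-----------+------+-------------------------------------------+-------------+-------------+-------------+-------------+-------------+-----------+-------------+---------------+--------------+-----------+------------+-----------------+------------+------------+--------------+------------+-----------------------+------------------+--------------+-----------------+------------------+------------------+----------------+---------------------+--------------------+------------------+------------+--------------+------------------------+---------------------+----------+------------+-------------+--------------+---------------+-------------+-----------------+----------------------+-----------------------+-----------------------+------------------+---------+--------------+--------------------+
| localhost | root | *E6D338325F50177F2F6A15EDZE932D68C88B8C4F | Y           | Y           | Y           | Y           | Y           | Y         | Y           | Y             | Y            | Y         | Y          | Y               | Y          | Y          | Y            | Y          | Y                     | Y                | Y            | Y               | Y                | Y                | Y              | Y                   | Y                  | Y                | Y          | Y            | Y                      | Y                   |          |            |             |              |             0 |           0 |               0 |                    0 | mysql_native_password |                       | N                | N       |              |           0.000000 |
| %         | root | *E6D338325F50177F2F6A15EDZE932D68C88B8C4F | Y           | Y           | Y           | Y           | Y           | Y         | Y           | Y             | Y            | Y         | N          | Y               | Y          | Y          | Y            | Y          | Y                     | Y                | Y            | Y               | Y                | Y                | Y              | Y                   | Y                  | Y                | Y          | Y            | Y                      | Y                   |          |            |             |              |             0 |           0 |               0 |                    0 |                       |                       | N                | N       |              |           0.000000 |
+-----------+------+-------------------------------------------+-------------+-------------+-------------+-------------+-------------+-----------+-------------+---------------+--------------+-----------+------------+-----------------+------------+------------+--------------+------------+-----------------------+------------------+--------------+-----------------+------------------+------------------+----------------+---------------------+--------------------+------------------+------------+--------------+------------------------+---------------------+----------+------------+-------------+--------------+---------------+-------------+-----------------+----------------------+-----------------------+-----------------------+------------------+---------+--------------+--------------------+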

 

The hashed (encrypted) password string has been changed from the one on the server, so please don't try to hack me (decrypt it) 🙂
As visible from the above output, the Host field for root has the '%' string, which means any hostname is authorized to connect and log in to the MySQL server, so this was not the problem.

I've spent quite some time reading on what causes
' ERROR 1045 (28000): Access denied for user 'root'@'localhost' (using password: YES) ',
including various forum discussions online such as the one on StackOverflow, SuperUser.com's how to fix access denied for user 'root'@'localhost' and one on askubuntu.com – ERROR 1045(28000) : Access denied for user 'root@localhost' (using password: no ), and after a while finally got it, thanks to a cool IRC.FREENODE.NET guy nicknamed hedenface, who pointed out that I'm trying to use the -H flag (Produce HTML) instead of -h (host_name). With -H the hostname was taken as a database name and the client connected to the local server, which is why the error mentions 'root'@'localhost'. It seems I somehow ended up with the wrong memory that -H stands for hostname; by simply using -h I could log in again. Hooray!!!

 

root@linux:~$ mysql -u root -h remote-server-host.com -p
Enter password:
Welcome to the MariaDB monitor.  Commands end with ; or \g.
Your MariaDB connection id is 14
Server version: 10.3.15-MariaDB-1 Debian 10

 

Copyright (c) 2000, 2018, Oracle, MariaDB Corporation Ab and others.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.


I've further asked the customer to confirm that he can also connect from his Microsoft Windows 10 PC situated on a different LAN network, and got his confirmation. A few notes to make here: I've also installed phpmyadmin on the server using the latest phpmyadmin PHP source code, as in Debian 10 it seems the good old phpmyadmin package is no longer available (these crazy developers again made a mess and there is no phpmyadmin .deb package in Debian Buster), but that's a different story I'll perhaps try to document in some small article in future.