Archive for the ‘Linux’ Category

Stop SSH Bruteforce authentication attempt Attacks with fail2ban

Monday, July 6th, 2020

Fail2ban stop restrict ssh bruteforce authentication attempt attacks

Most of webmasters today have some kind of SSH console remote access to the server and the OpenSSH Secure Shell service is usually not filtered for specific Networks but fully accessbible on the internet. This is especially true for home brew Linux Web servers as well as small to mid sized websites and blogs hosted on a cheap dedicated servers hosted in UK2 / Contabo RackSpace etc.

Brute force password guess attack tools such as Hydra and a distributed password dictionary files have been circulating quite for a while and if the attacker has enough time as well as a solid dictionary base, as well as some kind of relatively weak password you can expect that sooner or later some of the local UNIX accounts can be breaked and the script kiddie can get access to your server and make quickly a havoc, if he is lucky enough to be able to exploit some local vulnerability and get root access …

If you're a sysadmin that has to manage the Linux server and you do a routine log reading on the machine, you will soon get annoyed of the ever growing amount of different users, that are trying to login unsucessfully to the SSH (TCP port 22) service filling up the logs with junk and filling up disk space for nothing as well as consuming some CPU and Memory resources for nothing, you will need some easy  solution to make brute force attacks from an IP get filtered after few unsuccessful login attempts.
 

The common way to protect SSH would then be to ban an IP address from logging in if there are too many failed login attempts based on an automatic firewall inclusion of any IP that tried to unsuccessfully login lets say 5 or 10 reoccuring times.

In Linux there is a toll called “fail2ban” (F2B) used to limit brute force authentication attempts.

F2B works with minimal configuration and besides being capable of protecting the SSH service, it can be set to protect a lot of other Server applications like;

 Apache / NGINX Web Servers with PHP / Mail Servers (Exim, Postfix, Qmail, Sendmail), POP3 IMAP / AUTH services (Dovecot, Courier, Cyrus), DNS Bind servers, MySQL DBs, Monitoring tools such as Nagios, FTP servers (ProFTPD / PureFTP), Proxy servers (Squid), WordPress sites (wp-login) brute force attacks, Web Mail services (Horde / Roundcube / OpenWebmail), jabber servers etc.

To get a better overview below is F2B package description:
 

linux:~# apt-cache show fail2ban|grep -i description-en -A 21
Description-en: ban hosts that cause multiple authentication errors
 Fail2ban monitors log files (e.g. /var/log/auth.log,
 /var/log/apache/access.log) and temporarily or persistently bans
 failure-prone addresses by updating existing firewall rules.  Fail2ban
 allows easy specification of different actions to be taken such as to ban
 an IP using iptables or hostsdeny rules, or simply to send a notification
 email.
 .
 By default, it comes with filter expressions for various services
 (sshd, apache, qmail, proftpd, sasl etc.) but configuration can be
 easily extended for monitoring any other text file.  All filters and
 actions are given in the config files, thus fail2ban can be adopted
 to be used with a variety of files and firewalls.  Following recommends
 are listed:
 .
  – iptables/nftables — default installation uses iptables for banning.
    nftables is also suported. You most probably need it
  – whois — used by a number of *mail-whois* actions to send notification
    emails with whois information about attacker hosts. Unless you will use
    those you don't need whois
  – python3-pyinotify — unless you monitor services logs via systemd, you
    need pyinotify for efficient monitoring for log files changes

 

Using fail2ban is easy as there is a multitude of preexisting filters and actions (that gets triggered on a filter match) already written by different people usually found in /etc/fail2ban/filter.d an /etc/fail2ban/action.d.
as well as custom action / filter / jails is easy to do.
Fail2ban is available as a standard distro RPM / DEB package
on most modern versions of Ubuntu (16.04 and later), Debian, Mint and CentOS 7, OpenSuSE, Fedora etc.
 

1. Install Fail2ban package


– On Deb based distros do the usual:

 

linux:~# apt update
linux:~# apt install –yes fail2ban


On RPM distros Fedora / CentOS / SuSE etc.
 

linux:~# sudo yum -y install epel-release
linux:~# sudo yum -y install fail2ban

 

2. Enable fail2ban ban rules for SSH failed authentication filtering

 

Create the file /etc/fail2ban/jail.local :
 

# cat > /etc/fail2ban/jail.local


Paste below content:
 

[DEFAULT]
# Ban hosts for one hour:
bantime = 360
# Override /etc/fail2ban/jail.d/00-firewalld.conf:
banaction = iptables-multiport

 

[sshd]
enabled = true
maxretry = 10
findtime = 43200
# ban for 1 day
bantime = 3600
# ban for 1 day
#bantime = 86400

 

This config will ban for 1 day any IP that tries to access more than 10 unsuccesful times sshd daemon.
This works through IPTABLES indicated by config banaction = iptables-multiport config making fail2ban to automatically add a new iptables block rule valid for 1 day.

To close the file press CTRL + D simulanesly

  • maxretry controls the maximum number of allowed retries.
  • findtime specifies the time window (in seconds) which should be considered for banning an IP. (43200 seconds is 12 hours)
  • bantime specifies the time window (in seconds) for which the IP address will be banned (86400 seconds is 24 hours).

If SSH listens to a different port from 22 on the machine, you can specify the port number with port = <port_number> in this file.

 

3. Start fail2ban service


Either use the /etc/init.d/fail2ban start script or systemd systemctl
 

linux:~# systemctl enable fail2ban
linux:~# systemctl restart fail2ban

 

4. Fail2ban operational principle in short


Fail2ban uses Filters, Actions And Jails:

Filtersspecify certain patterns of text that Fail2ban should recognize in log files.
Actionsare things Fail2ban can do once a filter is matched.
Jailstell Fail2ban to match a filter on some logs. When the number of matches goes beyond a certain limit specified in the jail, Fail2ban takes an action specified in the jail.

If still wonder what is Fail2ban jail? Each configured jail tells fail2ban to look at system logs and take actions against attacks on a configured service, in our case OpenSSH service.
 

5. Blocking repeated unsuccessful password authentication attempts for longer periods


If more than number of failed ssh logins happen to occur (lets say 35 reoccuring ones in /var/log/auth.log (the debian failed ssh login file) or /var/log/secure (redhat distros failed ssh log file).
You will perhaps want to permanently block this IP for 3 days or so, here is how:
 

banaction = iptables-multiport
maxretry  = 35
findtime  = 259200
bantime   = 608400
enabled   = true
filter    = sshd

 

Add this to [sshlongterm] section to make it finally look like this:
 

[sshlongterm]
port      = ssh
logpath   = %(sshd_log)s
banaction = iptables-multiport
maxretry  = 35
findtime  = 259200
bantime   = 608400
enabled   = true
filter    = sshd


Of course to load new config restart fail2ban
 

# /etc/init.d/fail2ban restart


Default jail as well as the sshlongterm jail should now work together. Short term attacks will be handled by the default jail under [sshd], and the long term attacks handled by our own second jail [sshlongterm].
 

6. Checking which intruders were blocked by fail2ban


Fail2ban creates a separate chain f2b-sshd to which it adds each blocked IP for the period of time preset in the config, to list it:

linux:~# /sbin/iptables -L f2b-sshd
Chain f2b-sshd (1 references)
target     prot opt source               destination         
REJECT     all  —  165.22.2.95          anywhere             reject-with icmp-port-unreachable
REJECT     all  —  ec2-13-250-58-46.ap-southeast-1.compute.amazonaws.com  anywhere             reject-with icmp-port-unreachable
REJECT     all  —  cable200-116-3-133.epm.net.co  anywhere             reject-with icmp-port-unreachable
REJECT     all  —  mail.antracite.org   anywhere             reject-with icmp-port-unreachable
REJECT     all  —  139.59.57.2          anywhere             reject-with icmp-port-unreachable
REJECT     all  —  111.68.98.152.pern.pk  anywhere             reject-with icmp-port-unreachable
REJECT     all  —  216.126.59.61        anywhere             reject-with icmp-port-unreachable
REJECT     all  —  104.248.4.138        anywhere             reject-with icmp-port-unreachable
REJECT     all  —  210.211.119.10       anywhere             reject-with icmp-port-unreachable
REJECT     all  —  85.204.118.13        anywhere             reject-with icmp-port-unreachable
REJECT     all  —  broadband.actcorp.in  anywhere             reject-with icmp-port-unreachable
REJECT     all  —  104.236.33.155       anywhere             reject-with icmp-port-unreachable
REJECT     all  —  114.ip-167-114-114.net  anywhere             reject-with icmp-port-unreachable
REJECT     all  —  142.162.194.168.rfc6598.dynamic.copelfibra.com.br  anywhere             reject-with icmp-port-unreachable
REJECT     all  —  server.domain.cl     anywhere             reject-with icmp-port-unreachable
REJECT     all  —  ip202.ip-51-68-251.eu  anywhere             reject-with icmp-port-unreachable
REJECT     all  —  188.166.185.157      anywhere             reject-with icmp-port-unreachable
REJECT     all  —  93.49.11.206         anywhere             reject-with icmp-port-unreachable
RETURN     all  —  anywhere             anywhere            

 

Conclusion


What we have seen here is how to make fail2ban protect Internet firewall unrestricted SSHD Service to filter out 1337 skript kiddie 'hackers' out of your machine. With a bit of tuning could not only break the occasional SSH Brute Force bot scanners that craw the new but could even mitigate massive big botnet initiated brute force attacks to servers.
Of course fail2ban is not a panacea and to make sure you won't get hacked one days better make sure to only allow access to SSH service only for a certain IP addresses or IP address ranges that are of your own PCs.

Make Laptop Sleep on LID (Monitor) close in Linux Debian and Ubuntu systemd Linux

Monday, June 22nd, 2020

make-laptop-auto-sleep-on-lid-close-in-Linux-Ubuntu-Debian-Linux

 

 

I need to make my laptop automatically sleep on LID Screen close but it doesn't why?

If have used your laptop for long years with Windows or any Windows user is used to the default beavrior of Windows to automatically sleep the computer on PC close. This default behavior of automatically sleep on LID Close has been Windows standard for many years
and the reason behind that usually laptop is used for mobility and working on a discharging battery so a LID screen close puts the laptop in (SLEEP) BATTERY SUSPEND MODE aiming to make the charged battery last longer. However often for Desktop use in the Office LID close 
trigger of laptop sleep mode is annoying and undesired I've blogged earlier on that issue and how to make laptop not to sleep on LID close on M$ Windows 10 here.

This bahavior was copied and was working in many of the Linux distributions for years however in Debian GNU / Linux and Ubuntu 16.X this feature is often not properly working due to a systemd bug. Of course closing the notebook LID screen without putting
the PC in sleep mode is not a bug but a very useful feature for those who use their laptop as a Desktop machine that is non-stop running, however for most ppl default behavior to auto-suspend the computer on Laptop Monitor close is desired.

Here is how to  force the close of the laptop lid to go to suspend/sleep mode and when open the lid, it wake it up.
 

 

1. First requirement is to make sure the laptop has installed the package pm-utils, if it is not there install it with:

 

# apt-get install –yes pm-utils

 

2. Next we need to edit logind.conf and append 3 variables

 

# vim /etc/systemd/logind.conf


Normally the file should have a bit of commented informative lines as well as a commented variables that could be enabled like so:

 

[Login]
#NAutoVTs=6
#ReserveVT=6
#KillUserProcesses=no
#KillOnlyUsers=
#KillExcludeUsers=root
#InhibitDelayMaxSec=5
#HandlePowerKey=poweroff
#HandleSuspendKey=suspend
#HandleHibernateKey=hibernate
#HandleLidSwitch=suspend
#HandleLidSwitchExternalPower=suspend
#HandleLidSwitchDocked=ignore
#PowerKeyIgnoreInhibited=no
#SuspendKeyIgnoreInhibited=no
#HibernateKeyIgnoreInhibited=no
#LidSwitchIgnoreInhibited=yes
#HoldoffTimeoutSec=30s
#IdleAction=ignore
#IdleActionSec=30min
#RuntimeDirectorySize=10%
#RemoveIPC=yes
#InhibitorsMax=8192
#SessionsMax=8192


These entries are usually the files that are used by default as a systemd settings.
Before starting make a copy just you happen to mess systemd.conf, e.g.:

 

cp -rpf /etc/systemd/logind.conf /etc/systemd/logind.conf_bak


To make the PC LID close active append in the end of file below 3 lines:

 

HandleSuspendKey=suspend
HandleLidSwitch=suspend
HandleLidSwitchDocked=suspend

 

systemd-logind-conf-enable-suspend-sleep-on-laptop-lid-screen-close-linux

Save the file and to make systemd daemon reload restart the PC, even though theoretically systemd can be reloaded to digest its new /etc/systemd/logind.conf with:

 

# systemctl daemon-reexec

 

3. Assure yourself the Power Management LID setting of the Desktop Graphical User Interface are set to SUSPEND on close


I use MATE Desktop environment as it is simplistic and quite stable fork of GNOME 2.0, anyway depending on the GUI used on the Linux powered laptop e.g. GNOME / KDE Plasma / XFce etc. make sure the respective
 

Control Panel -> Power Management


settings are set to Force the Laptop Screen LID SUSPEND on Close.

Below is how this is done on MATE:

power-management-preferences-when-lid-is-closed-MATE-on-AC-power

power-management-preferences-when-lid-closed-on-battery-suspend

That's all folks, now close your Laptop and enjoy it going to sleep, open it up and get it awaked 🙂 Cheers ! 

 

Sysadmin tip: How to force a new Linux user account password change after logging to improve security

Thursday, June 18th, 2020

chage-linux-force-password-expiry-check-user-password-expiry-setting

Have you logged in through SSH to remote servers with the brand new given UNIX account in your company just to be prompted for your current Password immediately after logging and forced to change your password?
The smart sysadmins or security officers use this trick for many years to make sure the default set password for new user is set to a smarter user to prevent default password leaks which might later impose a severe security risk for a company Demiliterized networks confidential data etc.

If you haven't seen it yet and you're in the beautiful world of UNIX / Linux as a developer qa tester or sysadmin sooner or later you will face it.
Here of course I'm talking about plain password local account authentication using user / pass credentials stored in /etc/passwd or /etc/shadow.

Lets Say hello to the main command chage that is used to do this sysadmin trick.
chage command is used to change user password expiry information and  set and alter password aging parameters on user accounts.

 

1. Force chage to make password expire on next user login for a new created user
 

# chage -d 0 {user-name} 


Below is a real life example
 

chage-force-user-account-password-expiry-linux

 

2. Get information on when account expires

 

[hipo@linux ~]$ chage -l hipo
Last password change                                    : Apr 03, 2020
Password expires                                        : Jul 08, 2020
Password inactive                                       : never
Account expires                                         : never
Minimum number of days between password change          : 0
Maximum number of days between password change          : 90
Number of days of warning before password expires       : 14

 

3. Use chage to set user account password expiration

The most straight forward way to set an expiration date for an active user acct is with:

 

# chage -E 2020-08-16 username


To make the account get locked automatically if the password has expired and the user did not logged in to it for 2 days after its expiration.

# chage -I 2 username


– Set Password expire with Minimum days 7 (-n mindays 7), (-x maxdays 28) and (-w warndays 5)

# passwd -n 7 -x 28 -w 5 username

To check the passwod expiration settings use list command:

# chage -l username
Last password change                                    : юни 18, 2020
Password expires                                        : юли 16, 2020
Password inactive                                       : never
Account expires                                         : never
Minimum number of days between password change          : 7
Maximum number of days between password change          : 28
Number of days of warning before password expires       : 5

 

chage is a command is essential sysadmin command that is mentioned in every Learn Linux book out there, however due to its often rare used many people and sysadmins either, don't know it or learn of it only once it is needed. 
A note to make here is some sysadmins prefer to use usermod to set a password expire instead of chage.

usermod -e 2020-10-14 username

 

For those who wonder how to set password expiry on FreeBSD and other BSD-es is done, there it is done via the pw system user management tool as chage is not present there.

 

A note to make here is chage usually does not provide information for Linux user accounts that are stored in LDAP. To get information of such you can use ldapsearch with a query to the LDAP domain store with something like.
 

ldapsearch -x -ZZ -LLL -b dc=domain.com,dc=com objectClass=*


It is worthy to mention also another useful command when managing users this is getent used to get entries from Name Service Switch libraries. 
getent is useful to get various information from basic /etc/ stored db files such as /etc/services /etc/shadow, /etc/group, /etc/aliases, /etc/hosts and even do some simple rpc queries.

Report haproxy node switch script useful for Zabbix or other monitoring

Tuesday, June 9th, 2020

zabbix-monitoring-logo
For those who administer corosync clustered haproxy and needs to build monitoring in case if the main configured Haproxy node in the cluster is changed, I've developed a small script to be integrated with zabbix-agent installed to report to a central zabbix server via a zabbix proxy.
The script  is very simple it assumed DC1 variable is the default used haproxy node and DC2 and DC3 are 2 backup nodes. The script is made to use crm_mon which is not installed by default on each server by default so if you'll be using it you'll have to install it first, but anyways the script can easily be adapted to use pcs cmd instead.

Below is the bash shell script:

UserParameter=active.dc,f=0; for i in $(sudo /usr/sbin/crm_mon -n -1|grep -i 'Node ' |awk '{ print $2 }'); do ((f++)); DC[$f]="$i"; done; \
DC=$(sudo /usr/sbin/crm_mon -n -1 | grep 'Current DC' | awk '{ print $1 " " $2 " " $3}' | awk '{ print $3 }'); \
if [ “$DC” == “${DC[1]}” ]; then echo “1 Default DC Switched to ${DC[1]}”; elif [ “$DC” == “${DC[2]}” ]; then \
echo "2 Default DC Switched to ${DC[2]}”; elif [ “$DC” == “${DC[3]}” ]; then echo “3 Default DC: ${DC[3]}"; fi


To configure it with zabbix monitoring it can be configured via UserParameterScript.

The way I configured  it in Zabbix is as so:


1. Create the userpameter_active_node.conf

Below script is 3 nodes Haproxy cluster

# cat > /etc/zabbix/zabbix_agentd.d/userparameter_active_node.conf

UserParameter=active.dc,f=0; for i in $(sudo /usr/sbin/crm_mon -n -1|grep -i 'Node ' |awk '{ print $2 }'); do ((f++)); DC[$f]="$i"; done; \
DC=$(sudo /usr/sbin/crm_mon -n -1 | grep 'Current DC' | awk '{ print $1 " " $2 " " $3}' | awk '{ print $3 }'); \
if [ “$DC” == “${DC[1]}” ]; then echo “1 Default DC Switched to ${DC[1]}”; elif [ “$DC” == “${DC[2]}” ]; then \
echo "2 Default DC Switched to ${DC[2]}”; elif [ “$DC” == “${DC[3]}” ]; then echo “3 Default DC: ${DC[3]}"; fi

Once pasted to save the file press CTRL + D


The version of the script with 2 nodes slightly improved is like so:
 

UserParameter=active.dc,f=0; for i in $(sudo /usr/sbin/crm_mon -n -1|grep -i 'Node ' |awk '{ print $2 }' | sed -e 's#:##g'); do DC_ARRAY[$f]=”$i”; ((f++)); done; GET_CURR_DC=$(sudo /usr/sbin/crm_mon -n -1 | grep ‘Current DC’ | awk ‘{ print $1 ” ” $2 ” ” $3}’ | awk ‘{ print $3 }’); if [ “$GET_CURR_DC” == “${DC_ARRAY[0]}” ]; then echo “1 Default DC ${DC_ARRAY[0]}”; fi; if [ “$GET_CURR_DC” == “${DC_ARRAY[1]}” ]; then echo “2 Default Current DC Switched to ${DC_ARRAY[1]} Please check “; fi; if [ -z “$GET_CURR_DC” ] || [ -z “$DC_ARRAY[1]” ]; then printf "Error something might be wrong with HAProxy Cluster on  $HOSTNAME "; fi;


The haproxy_active_DC_zabbix.sh script with a bit of more comments as explanations is available here 
2. Configure access for /usr/sbin/crm_mon for zabbix user in sudoers

 

# vim /etc/sudoers

zabbix          ALL=NOPASSWD: /usr/sbin/crm_mon


3. Configure in Zabbix for active.dc key Trigger and Item

active-node-switch1

Monitoring Linux hardware Hard Drives / Temperature and Disk with lm_sensors / smartd / hddtemp and Zabbix Userparameter lm_sensors report script

Thursday, April 30th, 2020

monitoring-linux-hardware-with-software-temperature-disk-cpu-health-zabbix-userparameter-script

I'm part of a  SysAdmin Team that is partially doing some minor Zabbix imrovements on a custom corporate installed Zabbix in an ongoing project to substitute the previous HP OpenView monitoring for a bunch of Legacy Linux hosts.
As one of the necessery checks to have is regarding system Hardware, the task was to invent some simplistic way to monitor hardware with the Zabbix Monitoring tool.  Monitoring Bare Metal servers hardware of HP / Dell / Fujituse etc. servers  in Linux usually is done with a third party software provided by the Hardware vendor. But as this requires an additional services to run and sometimes is not desired. It was interesting to find out some alternative Linux native ways to do the System hardware monitoring.
Monitoring statistics from the system hardware components can be obtained directly from the server components with ipmi / ipmitool (for more info on it check my previous article Reset and Manage intelligent  Platform Management remote board article).
With ipmi
 hardware health info could be received straight from the ILO / IDRAC / HPMI of the server. However as often the Admin-Lan of the server is in a seperate DMZ secured network and available via only a certain set of routed IPs, ipmitool can't be used.

So what are the other options to use to implement Linux Server Hardware Monitoring?

The tools to use are perhaps many but I know of two which gives you most of the information you ever need to have a prelimitary hardware damage warning system before the crash, these are:
 

1. smartmontools (smartd)

Smartd is part of smartmontools package which contains two utility programs (smartctl and smartd) to control and monitor storage systems using the Self-Monitoring, Analysis and Reporting Technology system (SMART) built into most modern ATA/SATA, SCSI/SAS and NVMe disks

Disk monitoring is handled by a special service the package provides called smartd that does query the Hard Drives periodically aiming to find a warning signs of hardware failures.
The downside of smartd use is that it implies a little bit of extra load on Hard Drive read / writes and if misconfigured could reduce the the Hard disk life time.

 

linux:~#  /usr/sbin/smartctl -a /dev/sdb2
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-5-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     KINGSTON SA400S37240G
Serial Number:    50026B768340AA31
LU WWN Device Id: 5 0026b7 68340aa31
Firmware Version: S1Z40102
User Capacity:    240,057,409,536 bytes [240 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-3 T13/2161-D revision 4
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Apr 30 14:05:01 2020 EEST
SMART support is: Available – device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  120) seconds.
Offline data collection
capabilities:                    (0x11) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0002) Does not save SMART data before
                                        entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  10) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   100   100   000    Old_age   Always       –       100
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       –       2820
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       –       21
148 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      –       0
149 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      –       0
167 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      –       0
168 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       –       0
169 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      –       0
170 Unknown_Attribute       0x0000   100   100   010    Old_age   Offline      –       0
172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       –       0
173 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      –       0
181 Program_Fail_Cnt_Total  0x0032   100   100   000    Old_age   Always       –       0
182 Erase_Fail_Count_Total  0x0000   100   100   000    Old_age   Offline      –       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       –       0
192 Power-Off_Retract_Count 0x0012   100   100   000    Old_age   Always       –       16
194 Temperature_Celsius     0x0022   034   052   000    Old_age   Always       –       34 (Min/Max 19/52)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       –       0
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       –       0
218 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       –       0
231 Temperature_Celsius     0x0000   097   097   000    Old_age   Offline      –       97
233 Media_Wearout_Indicator 0x0032   100   100   000    Old_age   Always       –       2104
241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       –       1857
242 Total_LBAs_Read         0x0032   100   100   000    Old_age   Always       –       1141
244 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      –       32
245 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      –       107
246 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      –       15940

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

Selective Self-tests/Logging not supported

 

2. hddtemp

 

Usually if smartd is used it is useful to also use hddtemp which relies on smartd data.
 The hddtemp program monitors and reports the temperature of PATA, SATA
 or SCSI hard drives by reading Self-Monitoring Analysis and Reporting
 Technology (S.M.A.R.T.)
information on drives that support this feature.
 

linux:~# /usr/sbin/hddtemp /dev/sda1
/dev/sda1: Hitachi HDS721050CLA360: 31°C
linux:~# /usr/sbin/hddtemp /dev/sdc6
/dev/sdc6: KINGSTON SV300S37A120G: 25°C
linux:~# /usr/sbin/hddtemp /dev/sdb2
/dev/sdb2: KINGSTON SA400S37240G: 34°C
linux:~# /usr/sbin/hddtemp /dev/sdd1
/dev/sdd1: WD Elements 10B8: S.M.A.R.T. not available

 

 

3. lm-sensors / i2c-tools 

 Lm-sensors is a hardware health monitoring package for Linux. It allows you
 to access information from temperature, voltage, and fan speed sensors.
i2c-tools
was historically bundled in the same package as lm_sensors but has been seperated cause not all hardware monitoring chips are I2C devices, and not all I2C devices are hardware monitoring chips.

The most basic use of lm-sensors is with the sensors command

 

linux:~# sensors
i350bb-pci-0600
Adapter: PCI adapter
loc1:         +55.0 C  (high = +120.0 C, crit = +110.0 C)

 

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +28.0 C  (high = +78.0 C, crit = +88.0 C)
Core 0:         +26.0 C  (high = +78.0 C, crit = +88.0 C)
Core 1:         +28.0 C  (high = +78.0 C, crit = +88.0 C)
Core 2:         +28.0 C  (high = +78.0 C, crit = +88.0 C)
Core 3:         +28.0 C  (high = +78.0 C, crit = +88.0 C)

 


On CentOS Linux useful tool is also  lm_sensors-sensord.x86_64 – A Daemon that periodically logs sensor readings to syslog or a round-robin database, and warns of sensor alarms.

In Debian Linux there is also the psensors-server (an HTTP server providing JSON Web service which can be used by GTK+ Application to remotely monitor sensors) useful for developers
psesors-server

psensor-linux-graphical-tool-to-check-cpu-hard-disk-temperature-unix

If you have a Xserver installed on the Server accessed with Xclient or via VNC though quite rare,
You can use xsensors or Psensora GTK+ (Widget Toolkit for creating Graphical User Interface) application software.

With this 3 tools it is pretty easy to script one liners and use the Zabbix UserParameters functionality to send hardware report data to a Company's Zabbix Sserver, though Zabbix has already some templates to do so in my case, I couldn't import this templates cause I don't have Zabbix Super-Admin credentials, thus to work around that a sample work around is use script to monitor for higher and critical considered temperature.
Here is a tiny sample script I came up in 1 min time it can be used to used as 1 liner UserParameter and built upon something more complex.

SENSORS_HIGH=`sensors | awk '{ print $6 }'| grep '^+' | uniq`;
SENSORS_CRIT=`sensors | awk '{ print $9 }'| grep '^+' | uniq`; ;SENSORS_STAT=`sensors|grep -E 'Core\s' | awk '{ print $1" "$2" "$3 }' | grep "$SENSORS_HIGH|$SENSORS_CRIT"`;
if [ ! -z $SENSORS_STAT ]; then
echo 'Temperature HIGH';
else 
echo 'Sensors OK';
fi 

Of course there is much more sophisticated stuff to use for monitoring out there


Below script can be easily adapted and use on other Monitoring Platforms such as Nagios / Munin / Cacti / Icinga and there are plenty of paid solutions, but for anyone that wants to develop something from scratch just like me I hope this
article will be a good short introduction.
If you know some other Linux hardware monitoring tools, please share.

How to Import Remove List archive signing keys on CentOS / RHEL / Fedora RPM based Linux distributions

Wednesday, April 8th, 2020

how-to-import-remove-list-archiving-signing-keys-on-CentOS-RHEL-Fedora-rpm-based-Linux-distros-package
If you  plan to build and distribute  own RPMs securely, it is strongly recommended that all custom RPMs are signed using GNU Privacy Guard (GPG). Generating GPG keys and building GPG-signed packages matching it.
Hence, If you have to deal with some of the RPM based package management Linux distribution like CentOS / RHEL / Fedora etc. you will sooner or later end up in a situation where some of the archive signing keys for a package provided by some of the repositories is missing or it is not matching the keys provided for the RPM repo.

As a result you will be unable to install some package like lets say zabbix-sender or you won't be able to update a certain package to the latest available version, because the Archive Signing key is not found.
The usual naming for a RPM file with a GPG key in is YOUR-RPM-GPG-KEY.

A typical PGP Public key file content looks something like this:
 

—–BEGIN PGP PUBLIC KEY BLOCK—–

Version: GnuPG v1.0.0 (GNU/Linux)

Comment: For info see http://www.gnupg.org

mQGiBDfqVEqRBADBKr3Bl6PO8BQ0H8sJoD6p9U7Yyl7pjtZqioviPwXP+DCWd4u8

HQzcxAZ57m8ssA1LK1Fx93coJhDzM130+p5BG9mYSPShLabR3N1KXdXAYYcowTOM

GxdwYRGr1Spw8QydLhjVfU1VSl4xt6bupPbFJbyjkg5Z3P7BlUOUJmrx3wCgobNV

EDGaWYJcch5z5B1of/41G8kEAKii6q7Gu/vhXXnLS6m15oNnPVybyngiw/23dKjS

ti/PYrrL2J11P2ed0x7zm8v3gLrY0cue1iSba+8glY+p31ZPOr5ogaJw7ZARgoS8

BwjyRymXQp+8Dete0TELKOL2/itDOPGHW07SsVWOR6cmX4VlRRcWB5KejaNvdrE5

4XFtOd04NMgWI63uqZc4zkRa+kwEZtmbz3tHSdWCCE+Y7YVP6IUf/w6YPQFQriWY

FiA6fD10eB+BlIUqIw80EqjsBKmCwvKkn4jg8kibUgj4/TzQSx77uYokw1EqQ2wk

OZoaEtcubsNMquuLCMWijYhGBBgRAgAGBQI36lRyAAoJECGRgM3bQqYOhyYAnj7h

VDY/FJAGqmtZpwVp9IlitW5tAJ4xQApr/jNFZCTksnI+4O1765F7tA==

=3AHZ

—–END PGP PUBLIC KEY BLOCK—–

 

The usual naming for a RPM file with a GPG key in is YOUR-RPM-GPG-KEY
 

1. List RPM gpg keys installed on system

To list all the installed RPM gpg keys on the system do:

rpm -q gpg-pubkey


To get a list of the number of installed keys with verbose info with key description::

rpm -qa gpg-pubkey –qf "%{version}-%{release} %{summary}\n"|wc -l

 rpm -qa gpg-pubkey –qf "%{version}-%{release} %{summary}\n"
fdb19c98-56fd6333 gpg(Fedora 25 Primary (25) <fedora-25-primary@fedoraproject.org>)
7fac5991-4615767f gpg(Google, Inc. Linux Package Signing Key <linux-packages-keymaster@google.com>)
64dab85d-57d33e22 gpg(Fedora 26 Primary (26) <fedora-26-primary@fedoraproject.org>)
fa7a179a-562bcd6e gpg(RPM Fusion nonfree repository for Fedora (25) <rpmfusion-buildsys@lists.rpmfusion.org>)
6806a9cb-562bce39 gpg(RPM Fusion free repository for Fedora (25) <rpmfusion-buildsys@lists.rpmfusion.org>)
d38b4796-570c8cd3 gpg(Google Inc. (Linux Packages Signing Authority) <linux-packages-keymaster@google.com>)

 

[root@host ~:]# rpm -q gpg-pubkey –qf '%{NAME}-%{VERSION}-%{RELEASE}\t%{SUMMARY}\n'
gpg-pubkey-f4a80eb5-53a7ff4b    gpg(CentOS-7 Key (CentOS 7 Official Signing Key) )
gpg-pubkey-b6792c39-53c4fbdd    gpg(CentOS-7 Debug (CentOS-7 Debuginfo RPMS) )
gpg-pubkey-8fae34bd-538f1e51    gpg(CentOS-7 Testing (CentOS 7 Testing content) )

To list all OS installed gpg keys do:

[user@host ~:]$ $ rpm -qa | grep -i gpg
gpg-pubkey-db42a60e-37ea5438


2. Import RPM-GPG-KEY


A new key be it official archive keys issued from Fedora or a custom own build RPM package can be imported Redhat Package Manager like so:

[root@host ~:]# rpm –import RPM-GPG-KEY


It is possible to also import multiple GPG signature keys, for example on CentOS the usual path containg keys is /etc/pki/rpm-gpg/ to import all of the contained files there:

[root@host ~:]# rpm –import /etc/pki/rpm-gpg/*


3. Check package with imported gpg arch key


Once the RPM-GPG-KEY is imported you can compare whether a RPM package matches with the key signature.

[root@host ~:]# rpm –checksig package-1.3-3.src.rpm

[root@host ~:]# rpm –checksig xtoolwait-1.3-3.src.rpm
package-1.3-3.src.rpm: (sha1) dsa sha1 md5 gpg OK


4. Remove RPM installed arch key


If you have installed some gpg arch. key by mistake and you need to remove it:

[root@host ~:]#rpm -e gpg-pubkey-b6792c39-53c4fbdd


To make sure it is remove do a Listing once again signing archive keys, it should not show anymore:

[root@host ~:]# rpm -q gpg-pubkey –qf '%{NAME}-%{VERSION}-%{RELEASE}\t%{SUMMARY}\n'
gpg-pubkey-f4a80eb5-53a7ff4b    gpg(CentOS-7 Key (CentOS 7 Official Signing Key) )
gpg-pubkey-8fae34bd-538f1e51    gpg(CentOS-7 Testing (CentOS 7 Testing content) )

Find when cron.daily cron.weekly and cron.monthly run on Redhat / CentOS / Debian Linux and systemd-timers

Wednesday, March 25th, 2020

Find-when-cron.daily-cron.monthly-cron.weekly-run-on-Redhat-CentOS-Debian-SuSE-SLES-Linux-cron-logo

 

The problem – Apache restart at random times


I've noticed today something that is occuring for quite some time but was out of my scope for quite long as I'm not directly involved in our Alert monitoring at my daily job as sys admin. Interestingly an Apache HTTPD webserver is triggering alarm twice a day for a short downtime that lasts for 9 seconds.

I've decided to investigate what is triggering WebServer restart in such random time and investigated on the system for any background running scripts as well as reviewed the system logs. As I couldn't find nothing there the only logical place to check was cron jobs.
The usual
 

crontab -u root -l


Had no configured cron jobbed scripts so I digged further to check whether there isn't cron jobs records for a script that is triggering the reload of Apache in /etc/crontab /var/spool/cron/root and /var/spool/cron/httpd.
Nothing was found there and hence as there was no anacron service running but /usr/sbin/crond the other expected place to look up for a trigger even was /etc/cron*

 

1. Configured default cron execution times, every day, every hour every month

 

# ls -ld /etc/cron.*
drwxr-xr-x 2 root root 4096 feb 27 10:54 /etc/cron.d/
drwxr-xr-x 2 root root 4096 dec 27 10:55 /etc/cron.daily/
drwxr-xr-x 2 root root 4096 dec  7 23:04 /etc/cron.hourly/
drwxr-xr-x 2 root root 4096 dec  7 23:04 /etc/cron.monthly/
drwxr-xr-x 2 root root 4096 dec  7 23:04 /etc/cron.weekly/

 

After a look up to each of above directories, finally I found the very expected logrorate shell script set to execute from /etc/cron.daily/logrotate and inside it I've found after the log files were set to be gzipped and moved to execute WebServer restart with:

systemctl reload httpd 

 

My first reaction was to ponder seriously why the script is invoking systemctl reload httpd instead of the good oldschool

apachectl -k graceful

 

But it seems on Redhat and CentOS since RHEL / CentOS version 6.X onwards systemctl reload httpd is supposed to be identical and a substitute for apachectl -k graceful.
Okay the craziness of innovation continued as obviously the reload was causing a Downtime to be visible in the Zabbix HTTPD port Monitoring graph …
Now as the problem was identified the other logical question poped up how to find out what is the exact timing scheduled to run the script in that unusual random times each time ??
 

2. Find out cron scripts timing Redhat / CentOS / Fedora / SLES

 

/etc/cron.{daily,monthly,weekly} placed scripts's execution method has changed over the years, causing a chaos just like many Linux standard things we know due to the inclusion of systemd and some other additional weird OS design changes. The result is the result explained above scripts are running at a strange unexpeted times … one thing that was intruduced was anacron – which is also executing commands periodically with a different preset frequency. However it is considered more thrustworhty by crond daemon, because anacron does not assume the machine is continuosly running and if the machine is down due to a shutdown or a failure (if it is a Virtual Machine) or simply a crond dies out, some cronjob necessery for overall set environment or application might not run, what anacron guarantees is even though that and even if crond is in unworking defunct state, the preset scheduled scripts will still be served.
anacron's default file location is in /etc/anacrontab.

A standard /etc/anacrontab looks like so:
 

[root@centos ~]:# cat /etc/anacrontab
# /etc/anacrontab: configuration file for anacron
 
# See anacron(8) and anacrontab(5) for details.
 
SHELL=/bin/sh
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
# the maximal random delay added to the base delay of the jobs
RANDOM_DELAY=45
# the jobs will be started during the following hours only
START_HOURS_RANGE=3-22
 
#period in days   delay in minutes   job-identifier   command
1    5    cron.daily        nice run-parts /etc/cron.daily
7    25    cron.weekly        nice run-parts /etc/cron.weekly
@monthly 45    cron.monthly        nice run-parts /etc/cron.monthly

 

START_HOURS_RANGE : The START_HOURS_RANGE variable sets the time frame, when the job could started. 
The jobs will start during the 3-22 (3AM-10PM) hours only.

  • cron.daily will run at 3:05 (After Midnight) A.M. i.e. run once a day at 3:05AM.
  • cron.weekly will run at 3:25 AM i.e. run once a week at 3:25AM.
  • cron.monthly will run at 3:45 AM i.e. run once a month at 3:45AM.

If the RANDOM_DELAY env var. is set, a random value between 0 and RANDOM_DELAY minutes will be added to the start up delay of anacron served jobs. 
For instance RANDOM_DELAY equels 45 would therefore add, randomly, between 0 and 45 minutes to the user defined delay. 

Delay will be 5 minutes + RANDOM_DELAY for cron.daily for above cron.daily, cron.weekly, cron.monthly config records, i.e. 05:01 + 0-45 minutes

A full detailed explanation on automating system tasks on Redhat Enterprise Linux is worthy reading here.

!!! Note !!! that listed jobs will be running in queue. After one finish, then next will start.
 

3. SuSE Enterprise Linux cron jobs not running at desired times why?


in SuSE it is much more complicated to have a right timing for standard default cron jobs that comes preinstalled with a service 

In older SLES release /etc/crontab looked like so:

 

SHELL=/bin/bash
PATH=/sbin:/bin:/usr/sbin:/usr/bin
MAILTO=root
HOME=/

# run-parts
01 * * * * root run-parts /etc/cron.hourly
02 4 * * * root run-parts /etc/cron.daily
22 4 * * 0 root run-parts /etc/cron.weekly
42 4 1 * * root run-parts /etc/cron.monthly


As time of writting article it looks like:

 

SHELL=/bin/sh
PATH=/usr/bin:/usr/sbin:/sbin:/bin:/usr/lib/news/bin
MAILTO=root
#
# check scripts in cron.hourly, cron.daily, cron.weekly, and cron.monthly
#
-*/15 * * * *   root  test -x /usr/lib/cron/run-crons && /usr/lib/cron/run-crons >/dev/null 2>&1

 

 


This runs any scripts placed in /etc/cron.{hourly, daily, weekly, monthly} but it may not run them when you expect them to run. 
/usr/lib/cron/run-crons compares the current time to the /var/spool/cron/lastrun/cron.{time} file to determine if those jobs need to be run.

For hourly, it checks if the current time is greater than (or exactly) 60 minutes past the timestamp of the /var/spool/cron/lastrun/cron.hourly file.

For weekly, it checks if the current time is greater than (or exactly) 10080 minutes past the timestamp of the /var/spool/cron/lastrun/cron.weekly file.

Monthly uses a caclucation to check the time difference, but is the same type of check to see if it has been one month after the last run.

Daily has a couple variations available – By default it checks if it is more than or exactly 1440 minutes since lastrun.
If DAILY_TIME is set in the /etc/sysconfig/cron file (again a suse specific innovation), then that is the time (within 15minutes) when daily will run.

For systems that are powered off at DAILY_TIME, daily tasks will run at the DAILY_TIME, unless it has been more than x days, if it is, they run at the next running of run-crons. (default 7days, can set shorter time in /etc/sysconfig/cron.)
Because of these changes, the first time you place a job in one of the /etc/cron.{time} directories, it will run the next time run-crons runs, which is at every 15mins (xx:00, xx:15, xx:30, xx:45) and that time will be the lastrun, and become the normal schedule for future runs. Note that there is the potential that your schedules will begin drift by 15minute increments.

As you see this is very complicated stuff and since God is in the simplicity it is much better to just not use /etc/cron.* for whatever scripts and manually schedule each of the system cron jobs and custom scripts with cron at specific times.


4. Debian Linux time start schedule for cron.daily / cron.monthly / cron.weekly timing

As the last many years many of the servers I've managed were running Debian GNU / Linux, my first place to check was /etc/crontab which is the standard cronjobs file that is setting the { daily , monthly , weekly crons } 

 

 debian:~# ls -ld /etc/cron.*
drwxr-xr-x 2 root root 4096 фев 27 10:54 /etc/cron.d/
drwxr-xr-x 2 root root 4096 фев 27 10:55 /etc/cron.daily/
drwxr-xr-x 2 root root 4096 дек  7 23:04 /etc/cron.hourly/
drwxr-xr-x 2 root root 4096 дек  7 23:04 /etc/cron.monthly/
drwxr-xr-x 2 root root 4096 дек  7 23:04 /etc/cron.weekly/

 

debian:~# cat /etc/crontab 
# /etc/crontab: system-wide crontab
# Unlike any other crontab you don't have to run the `crontab'
# command to install the new version when you edit this file
# and files in /etc/cron.d. These files also have username fields,
# that none of the other crontabs do.

SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin# Example of job definition:
# .—————- minute (0 – 59)
# |  .————- hour (0 – 23)
# |  |  .———- day of month (1 – 31)
# |  |  |  .——- month (1 – 12) OR jan,feb,mar,apr …
# |  |  |  |  .—- day of week (0 – 6) (Sunday=0 or 7) OR sun,mon,tue,wed,thu,fri,sat
# |  |  |  |  |
# *  *  *  *  * user-name command to be executed
17 *    * * *    root    cd / && run-parts –report /etc/cron.hourly
25 6    * * *    root    test -x /usr/sbin/anacron || ( cd / && run-parts –report /etc/cron.daily )
47 6    * * 7    root    test -x /usr/sbin/anacron || ( cd / && run-parts –report /etc/cron.weekly )
52 6    1 * *    root    test -x /usr/sbin/anacron || ( cd / && run-parts –report /etc/cron.monthly )

What above does is:

– Run cron.hourly once at every hour at 1:17 am
– Run cron.daily once at every day at 6:25 am.
– Run cron.weekly once at every day at 6:47 am.
– Run cron.monthly once at every day at 6:42 am.

As you can see if anacron is present on the system it is run via it otherwise it is run via run-parts binary command which is reading and executing one by one all scripts insude /etc/cron.hourly, /etc/cron.weekly , /etc/cron.mothly

anacron – few more words

Anacron is the canonical way to run at least the jobs from /etc/cron.{daily,weekly,monthly) after startup, even when their execution was missed because the system was not running at the given time. Anacron does not handle any cron jobs from /etc/cron.d, so any package that wants its /etc/cron.d cronjob being executed by anacron needs to take special measures.

If anacron is installed, regular processing of the /etc/cron.d{daily,weekly,monthly} is omitted by code in /etc/crontab but handled by anacron via /etc/anacrontab. Anacron's execution of these job lists has changed multiple times in the past:

debian:~# cat /etc/anacrontab 
# /etc/anacrontab: configuration file for anacron

# See anacron(8) and anacrontab(5) for details.

SHELL=/bin/sh
PATH=/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin
HOME=/root
LOGNAME=root

# These replace cron's entries
1    5    cron.daily    run-parts –report /etc/cron.daily
7    10    cron.weekly    run-parts –report /etc/cron.weekly
@monthly    15    cron.monthly    run-parts –report /etc/cron.monthly

In wheezy and earlier, anacron is executed via init script on startup and via /etc/cron.d at 07:30. This causes the jobs to be run in order, if scheduled, beginning at 07:35. If the system is rebooted between midnight and 07:35, the jobs run after five minutes of uptime.
In stretch, anacron is executed via a systemd timer every hour, including the night hours. This causes the jobs to be run in order, if scheduled, beween midnight and 01:00, which is a significant change to the previous behavior.
In buster, anacron is executed via a systemd timer every hour with the exception of midnight to 07:00 where anacron is not invoked. This brings back a bit of the old timing, with the jobs to be run in order, if scheduled, beween 07:00 and 08:00. Since anacron is also invoked once at system startup, a reboot between midnight and 08:00 also causes the jobs to be scheduled after five minutes of uptime.
anacron also didn't have an upstream release in nearly two decades and is also currently orphaned in Debian.

As of 2019-07 (right after buster's release) it is planned to have cron and anacron replaced by cronie.

cronie – Cronie was forked by Red Hat from ISC Cron 4.1 in 2007, is the default cron implementation in Fedora and Red Hat Enterprise Linux at least since Version 6. cronie seems to have an acive upstream, but is currently missing some of the things that Debian has added to vixie cron over the years. With the finishing of cron's conversion to quilt (3.0), effort can begin to add the Debian extensions to Vixie cron to cronie.

Because cronie doesn't have all the Debian extensions yet, it is not yet suitable as a cron replacement, so it is not in Debian.
 

5. systemd-timers – The new crazy systemd stuff for script system job scheduling


Timers are systemd unit files with a suffix of .timer. systemd-timers was introduced with systemd so older Linux OS-es does not have it.
 Timers are like other unit configuration files and are loaded from the same paths but include a [Timer] section which defines when and how the timer activates. Timers are defined as one of two types:

 

  • Realtime timers (a.k.a. wallclock timers) activate on a calendar event, the same way that cronjobs do. The option OnCalendar= is used to define them.
  • Monotonic timers activate after a time span relative to a varying starting point. They stop if the computer is temporarily suspended or shut down. There are number of different monotonic timers but all have the form: OnTypeSec=. Common monotonic timers include OnBootSec and OnActiveSec.

     

     

    For each .timer file, a matching .service file exists (e.g. foo.timer and foo.service). The .timer file activates and controls the .service file. The .service does not require an [Install] section as it is the timer units that are enabled. If necessary, it is possible to control a differently-named unit using the Unit= option in the timer’s [Timer] section.

    systemd-timers is a complex stuff and I'll not get into much details but the idea was to give awareness of its existence for more info check its manual man systemd.timer

Its most basic use is to list all configured systemd.timers, below is from my home Debian laptop
 

debian:~# systemctl list-timers –all
NEXT                         LEFT         LAST                         PASSED       UNIT                         ACTIVATES
Tue 2020-03-24 23:33:58 EET  18s left     Tue 2020-03-24 23:31:28 EET  2min 11s ago laptop-mode.timer            lmt-poll.service
Tue 2020-03-24 23:39:00 EET  5min left    Tue 2020-03-24 23:09:01 EET  24min ago    phpsessionclean.timer        phpsessionclean.service
Wed 2020-03-25 00:00:00 EET  26min left   Tue 2020-03-24 00:00:01 EET  23h ago      logrotate.timer              logrotate.service
Wed 2020-03-25 00:00:00 EET  26min left   Tue 2020-03-24 00:00:01 EET  23h ago      man-db.timer                 man-db.service
Wed 2020-03-25 02:38:42 EET  3h 5min left Tue 2020-03-24 13:02:01 EET  10h ago      apt-daily.timer              apt-daily.service
Wed 2020-03-25 06:13:02 EET  6h left      Tue 2020-03-24 08:48:20 EET  14h ago      apt-daily-upgrade.timer      apt-daily-upgrade.service
Wed 2020-03-25 07:31:57 EET  7h left      Tue 2020-03-24 23:30:28 EET  3min 11s ago anacron.timer                anacron.service
Wed 2020-03-25 17:56:01 EET  18h left     Tue 2020-03-24 17:56:01 EET  5h 37min ago systemd-tmpfiles-clean.timer systemd-tmpfiles-clean.service

 

8 timers listed.


N ! B! If a timer gets out of sync, it may help to delete its stamp-* file in /var/lib/systemd/timers (or ~/.local/share/systemd/ in case of user timers). These are zero length files which mark the last time each timer was run. If deleted, they will be reconstructed on the next start of their timer.

Summary

In this article, I've shortly explain logic behind debugging weird restart events etc. of Linux configured services such as Apache due to configured scripts set to run with a predefined scheduled job timing. I shortly explained on how to figure out why the preset default install configured cron jobs such as logrorate – the service that is doing system logs archiving and nulling run at a certain time. I shortly explained the mechanism behind cron.{daily, monthy, weekly} and its execution via anacron – runner program similar to crond that never misses to run a scheduled job even if a system downtime occurs due to a crashed Docker container etc. run-parts command's use was shortly explained. A short look at systemd.timers was made which is now essential part of almost every new Linux release and often used by system scripts for scheduling time based maintainance tasks.