Posts Tagged ‘uid’

How to set up Notify by email expiring local UNIX user accounts on Linux / BSD with a bash script

Thursday, August 24th, 2023

password-expiry-linux-tux-logo-script-picture-how-to-notify-if-password-expires-on-unix

If you have already configured Linux Local User Accounts Password Security policies Hardening – Set Password expiry, password quality, limit repatead access attempts, add directionary check, increase logged history command size and you want your configured local user accounts on a Linux / UNIX / BSD system to not expire before the user is reminded that it will be of his benefit to change his password on time, not to completely loose account to his account, then you might use a small script that is just checking the upcoming expiry for a predefined users and emails in an array with lslogins command like you will learn in this article.

The script below is written by a colleague Lachezar Pramatarov (Credit for the script goes to him) in order to solve this annoying expire problem, that we had all the time as me and colleagues often ended up with expired accounts and had to bother to ask for the password reset and even sometimes clearance of account locks. Hopefully this little script will help some other unix legacy admin systems to get rid of the account expire problem.

For the script to work you will need to have a properly configured SMTP (Mail server) with or without a relay to be able to send to the script predefined email addresses that will get notified. 

Here is example of a user whose account is about to expire in a couple of days and who will benefit of getting the Alert that he should hurry up to change his password until it is too late 🙂

[root@linux ~]# date
Thu Aug 24 17:28:18 CEST 2023

[root@server~]# chage -l lachezar
Last password change                                    : May 30, 2023
Password expires                                        : Aug 28, 2023
Password inactive                                       : never
Account expires                                         : never
Minimum number of days between password change          : 0
Maximum number of days between password change          : 90
Number of days of warning before password expires       : 14

Here is the user_passwd_expire.sh that will report the user

# vim  /usr/local/bin/user_passwd_expire.sh

#!/bin/bash

# This script will send warning emails for password expiration 
# on the participants in the following list:
# 20, 15, 10 and 0-7 days before expiration
# ! Script sends expiry Alert only if day is Wednesday – if (( $(date +%u)==3 )); !

# email to send if expiring
alert_email='alerts@pc-freak.net';
# the users that are admins added to belong to this group
admin_group="admins";
notify_email_header_customer_name='Customer Name';

declare -A mails=(
# list below accounts which will receive account expiry emails

# syntax to define uid / email
# [“account_name_from_etc_passwd”]="real_email_addr@fqdn";

#    [“abc”]="abc@fqdn.com"
#    [“cba”]="bca@fqdn.com"
    [“lachezar”]="lachezar.user@gmail.com"
    [“georgi”]="georgi@fqdn-mail.com"
    [“acct3”]="acct3@fqdn-mail.com"
    [“acct4”]="acct4@fqdn-mail.com"
    [“acct5”]="acct5@fqdn-mail.com"
    [“acct6”]="acct6@fqdn-mail.com"
#    [“acct7”]="acct7@fqdn-mail.com"
#    [“acct8”]="acct8@fqdn-mail.com"
#    [“acct9”]="acct9@fqdn-mail.com"
)

declare -A days

while IFS="=" read -r person day ; do
  days[“$person”]="$day"
done < <(lslogins –noheadings -o USER,GROUP,PWD-CHANGE,PWD-WARN,PWD-MIN,PWD-MAX,PWD-EXPIR,LAST-LOGIN,FAILED-LOGIN  –time-format=iso | awk '{print "echo "$1" "$2" "$3" $(((($(date +%s -d \""$3"+90 days\")-$(date +%s)))/86400)) "$5}' | /bin/bash | grep -E " $admin_group " | awk '{print $1 "=" $4}')

#echo ${days[laprext]}
for person in "${!mails[@]}"; do
     echo "$person ${days[$person]}";
     tmp=${days[$person]}

#     echo $tmp
# each person will receive mails only if 20th days / 15th days / 10th days remaining till expiry or if less than 7 days receive alert mail every day

     if  (( (${tmp}==20) || (${tmp}==15) || (${tmp}==10) || ((${tmp}>=0) && (${tmp}<=7)) )); 
     then
         echo "Hello, your password for $(hostname -s) will expire after ${days[$person]} days.” | mail -s “$notify_email_header_customer_name $(hostname -s) server password expiration”  -r passwd_expire ${mails[$person]};
     elif ((${tmp}<0));
     then
#          echo "The password for $person on $(hostname -s) has EXPIRED before{days[$person]} days. Please take an action ASAP.” | mail -s “EXPIRED password of  $person on $(hostname -s)”  -r EXPIRED ${mails[$person]};

# ==3 meaning day is Wednesday the day on which OnCall Person changes

        if (( $(date +%u)==3 ));
        then
             echo "The password for $person on $(hostname -s) has EXPIRED. Please take an action." | mail -s "EXPIRED password of  $person on $(hostname -s)"  -r EXPIRED $alert_email;
        fi
     fi  
done

 


To make the script notify about expiring user accounts, place the script under some directory lets say /usr/local/bin/user_passwd_expire.sh and make it executable and configure a cron job that will schedule it to run every now and then.

# cat /etc/cron.d/passwd_expire_cron

# /etc/cron.d/pwd_expire
#
# Check password expiration for users
#
# 2023-01-16 LPR
#
02 06 * * * root /usr/local/bin/user_passwd_expire.sh >/dev/null

Script will execute every day morning 06:02 by the cron job and if the day is wednesday (3rd day of week) it will send warning emails for password expiration if 20, 15, 10 days are left before account expires if only 7 days are left until the password of user acct expires, the script will start sending the Alarm every single day for 7th, 6th … 0 day until pwd expires.

If you don't have an expiring accounts and you want to force a specific account to have a expire date you can do it with:

# chage -E 2023-08-30 someuser


Or set it for new created system users with:

# useradd -e 2023-08-30 username


That's it the script will notify you on User PWD expiry.

If you need to for example set a single account to expire 90 days from now (3 months) that is a kind of standard password expiry policy admins use, do it with:

# date -d "90 days" +"%Y-%m-%d"
2023-11-22


Ideas for user_passwd_expire.sh script improvement
 

The downside of the script if you have too many local user accounts is you have to hardcode into it the username and user email_address attached to and that would be tedios task if you have 100+ accounts. 

However it is pretty easy if you already have a multitude of accounts in /etc/passwd that are from UID range to loop over them in a small shell loop and build new array from it. Of course for a solution like this to work you will have to have defined as user data as GECOS with command like chfn.
 

[georgi@server ~]$ chfn
Changing finger information for test.
Name [test]: 
Office []: georgi@fqdn-mail.com
Office Phone []: 
Home Phone []: 

Password: 

[root@server test]# finger georgi
Login: georgi                       Name: georgi
Directory: /home/georgi                   Shell: /bin/bash
Office: georgi@fqdn-mail.com
On since чт авг 24 17:41 (EEST) on :0 from :0 (messages off)
On since чт авг 24 17:43 (EEST) on pts/0 from :0
   2 seconds idle
On since чт авг 24 17:44 (EEST) on pts/1 from :0
   49 minutes 30 seconds idle
On since чт авг 24 18:04 (EEST) on pts/2 from :0
   32 minutes 42 seconds idle
New mail received пт окт 30 17:24 2020 (EET)
     Unread since пт окт 30 17:13 2020 (EET)
No Plan.

Then it should be relatively easy to add the GECOS for multilpe accounts if you have them predefined in a text file for each existing local user account.

Hope this script will help some sysadmin out there, many thanks to Lachezar for allowing me to share the script here.
Enjoy ! 🙂

List and fix failed systemd failed services after Linux OS upgrade and how to get full info about systemd service from jorunal log

Friday, February 25th, 2022

systemd-logo-unix-linux-list-failed-systemd-services

I have recently upgraded a number of machines from Debian 10 Buster to Debian 11 Bullseye. The update as always has some issues on some machines, such as problem with package dependencies, changing a number of external package repositories etc. to match che Bullseye deb packages. On some machines the update was less painful on others but the overall line was that most of the machines after the update ended up with one or more failed systemd services. It could be that some of the machines has already had this failed services present and I never checked them from the previous time update from Debian 9 -> Debian 10 or just some mess I've left behind in the hurry when doing software installation in the past. This doesn't matter anyways the fact was that I had to deal to a number of systemctl services which I managed to track by the Failed service mesage on system boot on one of the physical machines and on the OpenXen VTY Console the rest of Virtual Machines after update had some Failed messages. Thus I've spend some good amount of time like an overall of a day or two fixing strange failed services. This is how this small article was born in attempt to help sysadmins or any home Linux desktop users, who has updated his Debian Linux / Ubuntu or any other deb based distribution but due to the chaotic nature of Linux has ended with same strange Failed services and look for a way to find the source of the failures and get rid of the problems. 
Systemd is a very complicated system and in my many sysadmin opinion it makes more problems than it solves, but okay for today's people's megalomania mindset it matches well.

Systemd_components-systemd-journalctl-cgroups-loginctl-nspawn-analyze.svg

 

1. Check the journal for errors, running service irregularities and so on
 

First thing to do to track for errors, right after the update is to take some minutes and closely check,, the journalctl for any strange errors, even on well maintained Unix machines, this journal log would bring you to a problem that is not fatal but still some process or stuff is malfunctioning in the background that you would like to solve:
 

root@pcfreak:~# journalctl -x
Jan 10 10:10:01 pcfreak CRON[17887]: pam_unix(cron:session): session closed for user root
Jan 10 10:10:01 pcfreak audit[17887]: USER_END pid=17887 uid=0 auid=0 ses=340858 subj==unconfined msg='op=PAM:session_close grantors=pam_loginuid,pam_env,pam_env,pam_permit>
Jan 10 10:10:01 pcfreak audit[17888]: CRED_DISP pid=17888 uid=0 auid=0 ses=340860 subj==unconfined msg='op=PAM:setcred grantors=pam_permit acct="root" exe="/usr/sbin/cron" >
Jan 10 10:10:01 pcfreak CRON[17888]: pam_unix(cron:session): session closed for user root
Jan 10 10:10:01 pcfreak audit[17888]: USER_END pid=17888 uid=0 auid=0 ses=340860 subj==unconfined msg='op=PAM:session_close grantors=pam_loginuid,pam_env,pam_env,pam_permit>
Jan 10 10:10:01 pcfreak audit[17884]: CRED_DISP pid=17884 uid=0 auid=0 ses=340855 subj==unconfined msg='op=PAM:setcred grantors=pam_permit acct="root" exe="/usr/sbin/cron" >
Jan 10 10:10:01 pcfreak CRON[17884]: pam_unix(cron:session): session closed for user root
Jan 10 10:10:01 pcfreak audit[17884]: USER_END pid=17884 uid=0 auid=0 ses=340855 subj==unconfined msg='op=PAM:session_close grantors=pam_loginuid,pam_env,pam_env,pam_permit>
Jan 10 10:10:01 pcfreak audit[17886]: CRED_DISP pid=17886 uid=0 auid=33 ses=340859 subj==unconfined msg='op=PAM:setcred grantors=pam_permit acct="www-data" exe="/usr/sbin/c>
Jan 10 10:10:01 pcfreak CRON[17886]: pam_unix(cron:session): session closed for user www-data
Jan 10 10:10:01 pcfreak audit[17886]: USER_END pid=17886 uid=0 auid=33 ses=340859 subj==unconfined msg='op=PAM:session_close grantors=pam_loginuid,pam_env,pam_env,pam_permi>
Jan 10 10:10:08 pcfreak NetworkManager[696]:  [1641802208.0899] device (eth1): carrier: link connected
Jan 10 10:10:08 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Up – 100Mbps/Full – flow control rx/tx
Jan 10 10:10:08 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Down
Jan 10 10:10:19 pcfreak NetworkManager[696]:
 [1641802219.7920] device (eth1): carrier: link connected
Jan 10 10:10:19 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Up – 100Mbps/Full – flow control rx/tx
Jan 10 10:10:20 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Down
Jan 10 10:10:22 pcfreak NetworkManager[696]:
 [1641802222.2772] device (eth1): carrier: link connected
Jan 10 10:10:22 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Up – 100Mbps/Full – flow control rx/tx
Jan 10 10:10:23 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Down
Jan 10 10:10:33 pcfreak sshd[18142]: Unable to negotiate with 66.212.17.162 port 19255: no matching key exchange method found. Their offer: diffie-hellman-group14-sha1,diff>
Jan 10 10:10:41 pcfreak NetworkManager[696]:
 [1641802241.0186] device (eth1): carrier: link connected
Jan 10 10:10:41 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Up – 100Mbps/Full – flow control rx/tx

If you want to only check latest journal log messages use the -x -e (pager catalog) opts

root@pcfreak;~# journalctl -xe

Feb 25 13:08:29 pcfreak audit[2284920]: USER_LOGIN pid=2284920 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='op=login acct=28696E76616C>
Feb 25 13:08:29 pcfreak sshd[2284920]: Received disconnect from 177.87.57.145 port 40927:11: Bye Bye [preauth]
Feb 25 13:08:29 pcfreak sshd[2284920]: Disconnected from invalid user ubuntuuser 177.87.57.145 port 40927 [preauth]

Next thing to after the update was to get a list of failed service only.


2. List all systemd failed check services which was supposed to be running

root@pcfreak:/root # systemctl list-units | grep -i failed
● certbot.service                                                                                                       loaded failed failed    Certbot
● logrotate.service                                                                                                     loaded failed failed    Rotate log files
● maldet.service                                                                                                        loaded failed failed    LSB: Start/stop maldet in monitor mode
● named.service                                                                                                         loaded failed failed    BIND Domain Name Server


Alternative way is with the –failed option

hipo@jeremiah:~$ systemctl list-units –failed
  UNIT                        LOAD   ACTIVE SUB    DESCRIPTION
● haproxy.service             loaded failed failed HAProxy Load Balancer
● libvirt-guests.service      loaded failed failed Suspend/Resume Running libvirt Guests
● libvirtd.service            loaded failed failed Virtualization daemon
● nvidia-persistenced.service loaded failed failed NVIDIA Persistence Daemon
● sqwebmail.service           masked failed failed sqwebmail.service
● tpm2-abrmd.service          loaded failed failed TPM2 Access Broker and Resource Management Daemon
● wd_keepalive.service        loaded failed failed LSB: Start watchdog keepalive daemon

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.
7 loaded units listed.

 

root@jeremiah:/etc/apt/sources.list.d#  systemctl list-units –failed
  UNIT                        LOAD   ACTIVE SUB    DESCRIPTION
● haproxy.service             loaded failed failed HAProxy Load Balancer
● libvirt-guests.service      loaded failed failed Suspend/Resume Running libvirt Guests
● libvirtd.service            loaded failed failed Virtualization daemon
● nvidia-persistenced.service loaded failed failed NVIDIA Persistence Daemon
● sqwebmail.service           masked failed failed sqwebmail.service
● tpm2-abrmd.service          loaded failed failed TPM2 Access Broker and Resource Management Daemon
● wd_keepalive.service        loaded failed failed LSB: Start watchdog keepalive daemon


To get a full list of objects of systemctl you can pass as state:
 

# systemctl –state=help
Full list of possible load states to pass is here
Show service properties


Check whether a service is failed or has other status and check default set systemd variables for it.

root@jeremiah~:# systemctl is-failed vboxweb.service
inactive

# systemctl show haproxy
Type=notify
Restart=always
NotifyAccess=main
RestartUSec=100ms
TimeoutStartUSec=1min 30s
TimeoutStopUSec=1min 30s
TimeoutAbortUSec=1min 30s
TimeoutStartFailureMode=terminate
TimeoutStopFailureMode=terminate
RuntimeMaxUSec=infinity
WatchdogUSec=0
WatchdogTimestampMonotonic=0
RootDirectoryStartOnly=no
RemainAfterExit=no
GuessMainPID=yes
SuccessExitStatus=143
MainPID=304858
ControlPID=0
FileDescriptorStoreMax=0
NFileDescriptorStore=0
StatusErrno=0
Result=success
ReloadResult=success
CleanResult=success

Full output of the above command is dumped in show_systemctl_properties.txt


3. List all running systemd services for a better overview on what's going on on machine
 

To get a list of all properly systemd loaded services you can use –state running.

hipo@jeremiah:~$ systemctl list-units –state running|head -n 10
  UNIT                              LOAD   ACTIVE SUB     DESCRIPTION
  proc-sys-fs-binfmt_misc.automount loaded active running Arbitrary Executable File Formats File System Automount Point
  cups.path                         loaded active running CUPS Scheduler
  init.scope                        loaded active running System and Service Manager
  session-2.scope                   loaded active running Session 2 of user hipo
  accounts-daemon.service           loaded active running Accounts Service
  anydesk.service                   loaded active running AnyDesk
  apache-htcacheclean.service       loaded active running Disk Cache Cleaning Daemon for Apache HTTP Server
  apache2.service                   loaded active running The Apache HTTP Server
  avahi-daemon.service              loaded active running Avahi mDNS/DNS-SD Stack

 

It is useful thing is to list all unit-files configured in systemd and their state, you can do it with:

 


root@pcfreak:~# systemctl list-unit-files
UNIT FILE                                                                 STATE           VENDOR PRESET
proc-sys-fs-binfmt_misc.automount                                         static          –            
-.mount                                                                   generated       –            
backups.mount                                                             generated       –            
dev-hugepages.mount                                                       static          –            
dev-mqueue.mount                                                          static          –            
media-cdrom0.mount                                                        generated       –            
mnt-sda1.mount                                                            generated       –            
proc-fs-nfsd.mount                                                        static          –            
proc-sys-fs-binfmt_misc.mount                                             disabled        disabled     
run-rpc_pipefs.mount                                                      static          –            
sys-fs-fuse-connections.mount                                             static          –            
sys-kernel-config.mount                                                   static          –            
sys-kernel-debug.mount                                                    static          –            
sys-kernel-tracing.mount                                                  static          –            
var-www.mount                                                             generated       –            
acpid.path                                                                masked          enabled      
cups.path                                                                 enabled         enabled      

 

 


root@pcfreak:~# systemctl list-units –type service –all
  UNIT                                   LOAD      ACTIVE   SUB     DESCRIPTION
  accounts-daemon.service                loaded    inactive dead    Accounts Service
  acct.service                           loaded    active   exited  Kernel process accounting
● alsa-restore.service                   not-found inactive dead    alsa-restore.service
● alsa-state.service                     not-found inactive dead    alsa-state.service
  apache2.service                        loaded    active   running The Apache HTTP Server
● apparmor.service                       not-found inactive dead    apparmor.service
  apt-daily-upgrade.service              loaded    inactive dead    Daily apt upgrade and clean activities
 apt-daily.service                      loaded    inactive dead    Daily apt download activities
  atd.service                            loaded    active   running Deferred execution scheduler
  auditd.service                         loaded    active   running Security Auditing Service
  auth-rpcgss-module.service             loaded    inactive dead    Kernel Module supporting RPCSEC_GSS
  avahi-daemon.service                   loaded    active   running Avahi mDNS/DNS-SD Stack
  certbot.service                        loaded    inactive dead    Certbot
  clamav-daemon.service                  loaded    active   running Clam AntiVirus userspace daemon
  clamav-freshclam.service               loaded    active   running ClamAV virus database updater
..

 


linux-systemd-components-diagram-linux-kernel-system-targets-systemd-libraries-daemons

 

4. Finding out more on why a systemd configured service has failed


Usually getting info about failed systemd service is done with systemctl status servicename.service
However, in case of troubles with service unable to start to get more info about why a service has failed with (-l) or (–full) options


root@pcfreak:~# systemctl -l status logrotate.service
● logrotate.service – Rotate log files
     Loaded: loaded (/lib/systemd/system/logrotate.service; static)
     Active: failed (Result: exit-code) since Fri 2022-02-25 00:00:06 EET; 13h ago
TriggeredBy: ● logrotate.timer
       Docs: man:logrotate(8)
             man:logrotate.conf(5)
    Process: 2045320 ExecStart=/usr/sbin/logrotate /etc/logrotate.conf (code=exited, status=1/FAILURE)
   Main PID: 2045320 (code=exited, status=1/FAILURE)
        CPU: 2.479s

Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| WARNING: For now we will assume you meant to write /32
Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| ERROR: '0.0.0.0/0.0.0.0' needs to be replaced by the term 'all'.
Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| SECURITY NOTICE: Overriding config setting. Using 'all' instead.
Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| WARNING: (B) '::/0' is a subnetwork of (A) '::/0'
Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| WARNING: because of this '::/0' is ignored to keep splay tree searching predictable
Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| WARNING: You should probably remove '::/0' from the ACL named 'all'
Feb 25 00:00:06 pcfreak systemd[1]: logrotate.service: Main process exited, code=exited, status=1/FAILURE
Feb 25 00:00:06 pcfreak systemd[1]: logrotate.service: Failed with result 'exit-code'.
Feb 25 00:00:06 pcfreak systemd[1]: Failed to start Rotate log files.
Feb 25 00:00:06 pcfreak systemd[1]: logrotate.service: Consumed 2.479s CPU time.


systemctl -l however is providing only the last log from message a started / stopped or whatever status service has generated. Sometimes systemctl -l servicename.service is showing incomplete the splitted error message as there is a limitation of line numbers on the console, see below

 

root@pcfreak:~# systemctl status -l certbot.service
● certbot.service – Certbot
     Loaded: loaded (/lib/systemd/system/certbot.service; static)
     Active: failed (Result: exit-code) since Fri 2022-02-25 09:28:33 EET; 4h 0min ago
TriggeredBy: ● certbot.timer
       Docs: file:///usr/share/doc/python-certbot-doc/html/index.html
             https://certbot.eff.org/docs
    Process: 290017 ExecStart=/usr/bin/certbot -q renew (code=exited, status=1/FAILURE)
   Main PID: 290017 (code=exited, status=1/FAILURE)
        CPU: 9.771s

Feb 25 09:28:33 pcfrxen certbot[290017]: The error was: PluginError('An authentication script must be provided with –manual-auth-hook when using th>
Feb 25 09:28:33 pcfrxen certbot[290017]: All renewals failed. The following certificates could not be renewed:
Feb 25 09:28:33 pcfrxen certbot[290017]:   /etc/letsencrypt/live/mail.pcfreak.org-0003/fullchain.pem (failure)
Feb 25 09:28:33 pcfrxen certbot[290017]:   /etc/letsencrypt/live/www.eforia.bg-0005/fullchain.pem (failure)
Feb 25 09:28:33 pcfrxen certbot[290017]:   /etc/letsencrypt/live/zabbix.pc-freak.net/fullchain.pem (failure)
Feb 25 09:28:33 pcfrxen certbot[290017]: 3 renew failure(s), 5 parse failure(s)
Feb 25 09:28:33 pcfrxen systemd[1]: certbot.service: Main process exited, code=exited, status=1/FAILURE
Feb 25 09:28:33 pcfrxen systemd[1]: certbot.service: Failed with result 'exit-code'.
Feb 25 09:28:33 pcfrxen systemd[1]: Failed to start Certbot.
Feb 25 09:28:33 pcfrxen systemd[1]: certbot.service: Consumed 9.771s CPU time.

 

5. Get a complete log of journal to make sure everything configured on server host runs as it should

Thus to get more complete list of the message and be able to later google and look if has come with a solution on the internet  use:

root@pcfrxen:~#  journalctl –catalog –unit=certbot

— Journal begins at Sat 2022-01-22 21:14:05 EET, ends at Fri 2022-02-25 13:32:01 EET. —
Jan 23 09:58:18 pcfrxen systemd[1]: Starting Certbot…
░░ Subject: A start job for unit certbot.service has begun execution
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░ 
░░ A start job for unit certbot.service has begun execution.
░░ 
░░ The job identifier is 5754.
Jan 23 09:58:20 pcfrxen certbot[124996]: Traceback (most recent call last):
Jan 23 09:58:20 pcfrxen certbot[124996]:   File "/usr/lib/python3/dist-packages/certbot/_internal/renewal.py", line 71, in _reconstitute
Jan 23 09:58:20 pcfrxen certbot[124996]:     renewal_candidate = storage.RenewableCert(full_path, config)
Jan 23 09:58:20 pcfrxen certbot[124996]:   File "/usr/lib/python3/dist-packages/certbot/_internal/storage.py", line 471, in __init__
Jan 23 09:58:20 pcfrxen certbot[124996]:     self._check_symlinks()
Jan 23 09:58:20 pcfrxen certbot[124996]:   File "/usr/lib/python3/dist-packages/certbot/_internal/storage.py", line 537, in _check_symlinks

root@server:~# journalctl –catalog –unit=certbot|grep -i pluginerror|tail -1
Feb 25 09:28:33 pcfrxen certbot[290017]: The error was: PluginError('An authentication script must be provided with –manual-auth-hook when using the manual plugin non-interactively.')


Or if you want to list and read only the last messages in the journal log regarding a service

root@server:~# journalctl –catalog –pager-end –unit=certbot


If you have disabled a failed service because you don't need it to run at all on the machine with:

root@rhel:~# systemctl stop rngd.service
root@rhel:~# systemctl disable rngd.service

And you want to clear up any failed service information that is kept in the systemctl service log you can do it with:
 

root@rhel:~# systemctl reset-failed

Another useful systemctl option is cat, you can use it to easily list a service it is useful to quickly check what is a service, an actual shortcut to save you from giving a full path to the service e.g. cat /lib/systemd/system/certbot.service

root@server:~# systemctl cat certbot
# /lib/systemd/system/certbot.service
[Unit]
Description=Certbot
Documentation=file:///usr/share/doc/python-certbot-doc/html/index.html
Documentation=https://certbot.eff.org/docs
[Service]
Type=oneshot
ExecStart=/usr/bin/certbot -q renew
PrivateTmp=true


After failed SystemD services are fixed, it is best to reboot the machine and check put some more time to inspect rawly the complete journal log to make sure, no error  was left behind.


Closure
 

As you can see updating a machine from a major to a major version even if you follow the official documentation and you have plenty of experience is always more or a less a pain in the ass, which can eat up much of your time banging your head solving problems with failed daemons issues with /etc/rc.local (which I have faced becase of #/bin/sh -e (which would make /etc/rc.local) to immediately quit if any error from command $? returns different from 0 etc.. The  logical questions comes then;
1. Is it really worthy to update at all regularly, especially if you don't know of a famous major Vulnerability 🙂 ?
2. Or is it worthy to update from OS major release to OS major release at all?  
3. Or should you only try to patch the service that is exposed to an external reachable computer network or the internet only and still the the same OS release until End of Life (LTS = Long Term Support) as called in Debian or  End Of Life  (EOL) Cycle as called in RPM based distros the period until the OS major release your software distro has official security patches is reached.

Anyone could take any approach but for my own managed systems small network at home my practice was always to try to keep up2date everything every 3 or 6 months maximum. This has caused me multiple days of irritation and stress and perhaps many white hairs and spend nerves on shit.


4. Based on the company where I'm employed the better strategy is to patch to the EOL is still offered and keep the rule First Things First (FTF), once the EOL is reached, just make a copy of all servers data and configuration to external Data storage, bring up a new Physical or VM and migrate the services.
Test after the migration all works as expected if all is as it should be change the DNS records or Leading Infrastructure Proxies whatever to point to the new service and that's it! Yes it is true that migration based on a full OS reinstall is more time consuming and requires much more planning, but usually the result is much more expected, plus it is much less stressful for the guy doing the job.

Speed up WordPress / Joomla CMS and MySQL server on Linux with tmpfs ram file system / Decrease Website pageload times with RAM caching

Wednesday, March 4th, 2015

speed-up-accelerate-wordpress-joomla-drupal-cms-and-mysql-server-with-tmpfs_ramfs_decrease-pageload-times-with-ram-caching
As a WordPress blog owner and an sys admin that has to deal with servers running a lot of WordPress / Joomla / Droopal and other custom CMS installed on servers, performoing slow or big enough to put a significant load on servers
and I love efficiency and hardware cost saving is essential for my daily job, I'm constantly trying to find new ways to optimize Customer Website (WordPress) and rest of sites in order to utilize better our servers and improve our clients sites speed (and hence satisfaction). 

There is plenty of little things to do on servers but probably among the most crucial ones which we use nowadays that save us a lot of money is tmpfs, and earlier (ramfs) – previously known as shmfs).
TMPFS is a (Temporary File Storage Facility) Linux kernel technology based on ramfs (used by Linux kernel initrd / initramfs on boot time in order to load and store the Linux kernel in memory, before system hard disk partition file systems are mounted) which is heavily used by virtually all modern popular Linux distributions. 

Using ramfs (cramfs variation – Compressed ROM filesystem) has been used to store different system environment kernel and Desktop components of many Linux environment / applications and used by a lot of the Linux BootCD such as the most famous (Klaus Knopper's) KNOPPIX LiveCD and Trinity Rescue Kit Linux (TRK uses /dev/shm which btw can be seen on most modern Linux distros and is actually just another mounted tmpfs).
If you haven't tried Live Linux yet try it out as me and a lot of sysadmins out there use some kind of LiveLinux at least few times on yearly basis  to Recover Unbootable Linux servers after some applied remote Updates as well as for Rescuing (Save) Data from Linux server failing to properly boot because of hard disk (bad blocks) failures. As I said earlier TMPFS is also used on almost any distribution for the /dev/ filesystem which is kept in memory.

You can see which tmpfs partitions is used on your Linux server with:

 

debian-server:~# mount |grep -i tmpfs
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)

 

Above is an output from a standard Debian Linux server. On CentOS 7 standard mounted tmpfs are as follows:

 

[root@centos ~]# mount |grep -i tmpfs
devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=1016332k,nr_inodes=254083,mode=755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel)
tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,seclabel,mode=755)

 

[root@centos ~]# df -h|grep -i tmpfs
devtmpfs                 993M     0  993M   0% /dev
tmpfs                   1002M   92K 1002M   1% /dev/shm
tmpfs                   1002M  8.8M  993M   1% /run
tmpfs                   1002M     0 1002M   0% /sys/fs/cgroup

The /run tmpfs mounted directory is also to be seen also on latest Ubuntus and Fedoras and is actually the good old /var/run ( where applications keep there pids and some small app related files) stored in tmpfs filesystem stored in memory.

If you're wondering what is /dev/shm and why it appears mounted on every single Linux Server / Desktop you've ever used this is a special filesystem shared memory which various running programs (processes) can use to transfer data quick and efficient between each other to preven the slow disk swapping. People using Linux for the rest 15 years should remember /dev/shm has been a target of a lot of kernel exploits as historically it had a lot of security issues.

While writting this article I've just checked about KNOPPIX developed amd just for info as of time of writting this distro has already 1000+ programs on CD version and 2600+ packages / application on DVD version.
Nowadays Knoppix is mostly used mostly as USB Live Flash drive as a lot of people are dropping CD / DVD use (many servers doesn't have a CD / DVD Drive) and for USB Live Flash Linux distros tmpfs is also key technology used as this gives the end user an amazing fast experience (Desktop applications run much fasten on Live USBs when tmpfs is used than when the slow 7200 RPM HDDs are used).

Loading big parts of the distribution within RAM (with tmpfs from Linux Kernel 2.4+ onwards) is also heavily used by a lot of Cluster vendors in most of Clustered (Cloud) Linux based environemnts, cause TMPFS gives often speeds up improvements to x30 times and decreases greatly I/O HDD. FreeBSD users will be happy to know that TMPFS is already ported and could be used on from FreeBSD 7.0+ onward.

In this small article I will give you example use on how I use tmpfs to speed up our WordPress Websites which use WP Caching plugins such as W3 Total Cache and WP Super Cache
and Hyper Cache / WP Super Cache disk caching and MySQL server as a Database backend.
Below example is wordpress specific but since it can be easily applied to JoomlaDrupal or any other CMS out there that uses mySQL server to make a lot of CPU expensive memory hungry (LEFT JOIN) queries which end up using a slow 7200 RPM hard disk.


 

1. Preparing tmpfs partitions for WordPress File Cache directory
 

If you want to give tmpfs a test drive, I recommend you try to create / mount a 20 Megabyte partition. To create a tmpfs partition you don't need to use a tool like mkfs.ext3 / mkfs.ext4 as TMPFS is in reality a virtual filesystem that is mapped in the server system physical RAM (volatile memory). TMPFS is very nice because if you run out of free RAM system starts a combination of RAM use + some Hard disk SWAP 
The great thing about TMPFS is it never uses all of the available RAM and SWAP, which would not halt your server if TMPFS partition gets filled, but instead you will start getting the usual "Insufficient Disk Space", just like with a physical HDD parititon. RAMFS cares much less about server compared to TMPFS, because if RAMFS is historically older.

ramfs file systems cannot be limited in size like a disk base file system which is limited by it’s capacity, thus ramfs will continue using memory storage until the system runs out of RAM and likely crashes or becomes unresponsive. This is a problem if the application writing to the file system cannot be limited in total size, so in my opinion you better stay away from RAMFS except you have a good idea what you're doing. Another disadvantage of RAMFS compared to TMPFS is you cannot see the size of the file system in df and it can only be estimated by looking at the cached entry in free.

Note that before proceeding to use TMPFS or RAMFS you should know besides having advantages, there are certain serious disadvantage that if the server using tmpfs (in RAM) to store files crashes the customer might loose his data, therefore using RAM filesystems on Production servers is best to be used just for caching folders which are regularly synchronized with (rsync) to some folder to assure no data will be lost on server reboot or crash.

Memory of fast storage areas are ideally suited for applications which need repetitively small data areas for caching or using as temporary space such as Jira (Issue and Proejct Tracking Software) Indexing  As the data is lost when the machine reboots the tmpfs stored data must not be data of high importance as even scheduling backups cannot guarantee that all the data will be replicated in the even of a system crash.

To test mounting a tmpfs virtual (memory stored) filesystem issue:
 

mount -t tmpfs tmpfs -o size=256m /mnt/tmpfs


If you want to test mount a ramfs instead:

 

 mount -t ramfs -o size=256m ramfs /mnt/ramfs

 

debian-server:~#  mount |grep -i -E "ramfs|tmpfs"
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /mnt/tmpfs type tmpfs (rw,size=256m)
ramfs on /mnt/ramfs type tmpfs (rw,size=256m)

 

Once mounted tmpfs can be used in the same way as any ext4 / reiserfs filesystem. In the same way to make mounts permanent, its necessery to add a line to /etc/fstab

To illustrate better a tmpfs use case on my blog running WordPress with W3TotalCache (W3TC) plugin cache folder in /var/www/blog/wp-content/w3tc to get advantage of tmpfs to store w3tc files.

a) Stop Apache

On Debian
 

debian-server:~# /etc/init.d/apache stop


On CentOS 
 

[root@centos ~]# /etc/init.d/httpd stop


b) Move w3tc dir to w3tc-bak

 

debian-server:~# cd /var/www/blog/wp-content/
debian-server:~# mv w3tc w3tc-bak

 

c) Create w3tc directory
 

debian-server:/var/www/blog/wp-content# mkdir w3tc
debian-server:/var/www/blog/wp-content# chown -R www-data:www-data w3tc


d) Add tmpfs record to /etc/fstab

My W3TC Cache didn't grow bigger than 2Gigabytes so I create a 2Giga directory for it by adding following in /etc/fstab 
 

debian-server:~# vim /etc/fstab

 

tmpfs /var/www/blog/wp-content/w3tc tmpfs defaults,size=2g,noexec,nosuid,uid=33,gid=33,mode=1755 0 0


You might also want to add the nr_inodes (option) to tmpfs while mounting. nr_inodes is the maximum inode for instance. Default is half the number of your physical RAM pages, (on a machine with highmem) the number of lowmem RAM page, some common option that should work is nr_inodes=5k, if you're unsure what this option does you can safely skip it 🙂

e) Mount new added tmpfs folder

Then to mount the newly added filesystem issue:
 

mount -a


Or if you're on a CentOS / RHEL server use httpd Apache user instead and whenever you have docroot and wordpress installed.

 

[root@centos ~]# chown -R apache:apache: w3tc


If you're using Apache SuPHP use whatever the UID / GID is proper.

On CentOS you will need to set proper UID and GID (UserID / GroupID), to find out which ones to to use check in /etc/passwd:
 

[root@centos ~]# grep -i apache /etc/passwd
apache:x:48:48:Apache:/var/www:/sbin/nologin


f) Move old w3tc cache from w3tc-bak to w3tc

 

debian-server:/var/www/blog/wp-content# mv w3tc-bak/* w3tc/

 

g) Start again Apache

On Debian:

 

debian-server:~# /etc/init.d/apache2 start

 


On CentOS:
 

[root@centos~]# /etc/init.d/httpd start

h) Keeping w3tc cache site folder synced

As I said earlier the biggest problem with caching (the reason why many hosting providers) and site admins refuse to use it is they might loose some data, to prevent data loss or at least mitigate the data loss to few minutes intervals it is a good idea to synchronize tmpfs kept folders somewhere to disk with rsync.

To achieve that use a cronjob like this:
 

debian-server:~# crontab -u root -e
*/5 * * * * /usr/bin/ionice -c3 -n7 /usr/bin/nice -n 19 /usr/bin/rsync -ah –stats –delete /var/www/blog/wp-content/w3tc/ /backups/tmpfs/cache/ 1>/dev/null


Note that you will need to have the /backups/tmpfs/cache folder existing, create it with:

 

debian-server:~# mkdir -p /backups/tmpfs/cache


You will also need to add a rsync synchronization from backupped folder to tmpfs (in case if the server gets accidently rebooted because it hanged or power outage), place in

/etc/rc.local

 

ionice -c3 -n7 nice -n 19 rsync -ahv –stats –delete /backups/tmpfs/cache/ /var/www/blog/wp-content/w3tc/ 1>/dev/null


(somewhere before exit 0) line
 

0 05 * * * /usr/bin/ionice -c3 -n7 /bin/nice -n 19 /usr/bin/rsync -ah –stats –delete /var/www/blog/wp-content/w3tc/ /backups/tmpfs/cache/ 1>/dev/null

 

 

2. Preparing tmpfs partitions for MySQL server temp File Cache directory


Its common that MySQL servers had to serve a lot of long and heavy SQL JOIN Queries mostly by related posts WP plugins such as (Zemanta Related Posts) and Contextual Related posts though MySQLs are well optimized  to work as much as efficient using mysql tuner (tuning primer) still often SQL servers get a lot of temp tables created to disk (about 25% to 30%) of all SQL queries use somehow HDD to serve queries and as this is very slow and there is file lock created the overall MySQL performance becomes sluggish at times to fix (resolve) that without playing with SQL code to optimize the slow queries the best way I found is by using TMPFS as MySQL temp folder.

To do so I create a TMPFS usually the size of 256 MB because this is usually enough for us, but other hosting companies might want to add bigger virtual temp disk:

a) Add tmpfs new dir to /etc/fstab

In /etc/fstab add below record with vim editor:
 

debian-server:~# vim /etc/fstab

 

tmpfs /var/mysqltmp tmpfs rw,gid=111,uid=108,size=256M,nr_inodes=10k,mode=0700 0 0

 

Note that the uid / and gid 105 and 114 are taken again from /etc/passwd

On Debian

debian-server:~# grep -i mysql /etc/passwd
mysql:x:108:111:MySQL Server,,,:/var/lib/mysql:/bin/false


On CentOS
 

[root@centos ~]# grep -i mysql /etc/passwd
mysql:x:27:27:MySQL Server:/var/lib/mysql:/bin/bash


b) Create folder /var/mysqltmp or whenever you want to place the tmpfs memory kept SQL folder

 

debian-server:~# mkdir /var/mysqltmp
debian-server:~# chown mysql:mysql /var/mysqltmp

 

debian-server:~# mount|grep -i tmpfs
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /var/www/blog/wp-content/w3tc type tmpfs (rw,noexec,nosuid,size=2g,uid=33,gid=33,mode=1755)
tmpfs on /var/mysqltmp type tmpfs (rw,gid=108,uid=111,size=256M,nr_inodes=10k,mode=0700)


c) Add new path to tmpfs created folder in my.cnf 

Then  edit /etc/mysql/my.cnf

 

debian-server:~# vim /etc/mysql/my.cnf

[mysqld]
#
# * Basic Settings
#
user        = mysql
pid-file    = /var/run/mysqld/mysqld.pid
socket      = /var/run/mysqld/mysqld.sock
port        = 3306
basedir     = /usr
datadir     = /var/lib/mysql
tmpdir      = /var/mysqltmp

 

On CentOS edit and change tmpdir in same way within /etc/my.cnf


d) Finally Restart Apache and MySQL to make mysql start using new set tmpfs memory kept folder

On Debian:
 

debian-server:~# /etc/init.d/apache2 stop; /etc/init.d/mysql restart; /etc/init.d/apache2 start

On CentOS:
 

[root@centos ~]# /etc/init.d/httpd stop; /etc/init.d/mysqld restart; /etc/initd/httpd start


Now monitor your server and check your pagespeed increase for me such an optimization usually improves site performance so site becomes +50% faster, to see the difference you can test your website before applying tmpfs caching for site and after that by using Google PageInsight (PageSpeed) Online Test. Though this example is for MySQL and WordPress you can easily adopt the same for Joomla if you have Joomla Caching enabled to some folder, same goes for any other CMS such as Drupal that can take use of Disk Caching. Actually its a small secret of many Hosting providers that allow clients to create sites via CPanel and Kloxo this tmpfs optimizations are already used for sites and by this the provider is able to offer better website service on lower prices. VPS hosting providers also use heavy caching. A lot of people are using TMPFS also to accelerate Sites that have enabled Google Pagespeed as Cacher and accelerator, as PageSpeed module puts a heavy HDD I/O load that can easily stone the server. Many admins also choose to use TMPFS for  /tmp, /var/run, and /var/lock directories as this leads often to significant overall server services operations improvement.
Once you have tmpfs enabled, It is a good idea to periodically monitor your SWAP used space with (df -h), because if you allocate bigger tmpfs partitions than your physical memory and tmpfs's full size starts to be used your machine will start swapping heavily and this could have a very negative performance affect.
 

debian-server:~# df -h|grep -i tmpfs
tmpfs            3,9G     0   3,9G   0% /lib/init/rw
tmpfs            3,9G     0   3,9G   0% /dev/shm
tmpfs            2,0G  1,4G   712M  66% /var/www/blog/wp-content/w3tc
tmpfs            256M     0   256M   0% /mnt/tmpfs
tmpfs            256M  236K   256M   1% /var/mysqltmp

The applications of tmpfs to accelerate services is up to your imagination, so I will be glad to hear from other admins on any interesting other application or problems faced while using TMPFS.

 Enjoy! 🙂

swap_pager_getswapspace: failed, MySQL troubles on FreeBSD 7.2 cause and solution

Tuesday, May 3rd, 2011

Every now and then my FreeBSD router dmesg ( /var/log/dmesg.today ) logs, gets filled with error messages like:

pid 86369 (httpd), uid 80, was killed: out of swap space
swap_pager_getswapspace(14): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(11): failed
swap_pager_getswapspace(12): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(14): failed
swap_pager_getswapspace(16): failed
swap_pager_getswapspace(8): failed

Using swapinfo during the swap_pager_getswapspace(16): failed messages were logged in, I figured out that definitely the swap memory over-use is the bottleneck for the troubles, to find this I used the command:

freebsd# swapinfo
Device 1K-blocks Used Avail Capacity Type
/dev/ad0s1b 49712 45920 3792 92% Interleaved

After some investigation, I’ve figured out that the MySQL server is causing the kernel exceeded swap troubles.

My current MySQL server version is installed from the ports tree, whether I’m using the bsd port /usr/ports/databases/mysql51-server/ and it appears to work just fine.

However I have noticed that the mysql-server is missing a my.cnf file!, which means the mysql server is running under a mode with some kind of default configurations.

Strangely in the system process list it appeared it is using a default my.cnf file located in /var/db/mysql/my.cnf

Below you see the paste from the ps command:

ps axuww freebsd# ps axuww | grep -i my.cnf | grep -v grep
mysql 7557 0.0 0.1 3464 1268 p1 I 12:03PM 0:00.01 /bin/sh /usr/local/bin/mysqld_safe --defaults-extra-file=/var/db/mysql/my.cnf --user=mysql --datadir=/var/db/mysql --pid-file=/var/db/mysql/pcfreak.pidmysql 7589 0.0 5.1 93284 52852 p1 I 12:03PM 0:59.01 /usr/local/libexec/mysqld --defaults-extra-file=/var/db/mysql/my.cnf --basedir=/usr/local --datadir=/var/db/mysql --user=mysql --pid-file=/var/db/mysql/pcfreak.pid --port=3306 --socket=/tmp/mysql.sock

Nevertheless it appeared the sql server is running the file /var/db/mysql/my.cnf conf was not existing! This was really weird for me as I’m used to have the default my.cnf from my previous experience with Linux servers!

Thus the next logical thing I did was to create my.cnf conf file in order to be able to have a proper limiting configuration for the sql server.

The FreeBSD my.cnf skele files are found in /usr/local/share/mysql/, here are the 4 files one can use as a starting basis for further configuration of the mysql-server.

freebsd# ls -al /usr/local/share/mysql/my-*.cnf
-r--r--r-- 1 root wheel 4948 Aug 12 2009 /usr/local/share/mysql/my-huge.cnf
-r--r--r-- 1 root wheel 20949 Aug 12 2009 /usr/local/share/mysql/my-innodb-heavy-4G.cnf
-r--r--r-- 1 root wheel 4924 Aug 12 2009 /usr/local/share/mysql/my-large.cnf
-r--r--r-- 1 root wheel 4931 Aug 12 2009 /usr/local/share/mysql/my-medium.cnf
-r--r--r-- 1 root wheel 2502 Aug 12 2009 /usr/local/share/mysql/my-small.cnf

I have chosen to use the my-medium.cnf as a skele to tune up, as my server is not high iron one e.g. the host I run the mysql is a (simple dual core 1.2Ghz system).

Further on I copied the /usr/local/share/mysql/my-medium.cnf to /var/db/mysql/my.cnf e.g.:

freebsd# cp -rpf /usr/local/share/mysql/my-medium.cnf /var/db/mysql/my.cnf

As a next step to properly tune up the default values of the newly copied my.cnf to my specific server I used the Tuning-Primer MySQL tuning script

Using tuning-primer.sh is really easy as all I did is download, launch it and follow the script suggestions to correct some of the values already in my.cnf

I have finally ended up with the following my.cnf after using tuning-primer.sh to optimize mysql server to work with my bsd host

Now I really hope the shitty swap_pager_getswapspace: failed errors would not haunt me once again by crashing my server and causing mem overheads.

Still I wonder why the port developer Alex Dupre – ale@FreeBSD.org choose not to provide the default mysql51-server conf with some kind of my.cnf file? I hope he had a good reason.

How to debug mod_rewrite .htaccess problems with RewriteLog / Solve mod_rewrite broken redirects

Friday, September 30th, 2011

Its common thing that CMS systems and many developers custom .htaccess cause issues where websites depending on mod_rewrite fails to work properly. Most common issues are broken redirects or mod_rewrite rules, which behave differently among the different mod_rewrite versions which comes with different versions of Apache.

Everytime there are such problems its necessery that mod_rewrite’s RewriteLog functionality is used.
Even though the RewriteLog mod_rewrite config variable is well described on httpd.apache.org , I decided to drop a little post here as I’m pretty sure many novice admins might not know about RewriteLog config var and might benefit of this small article.
Enabling mod_rewrite requests logging of requests to the webserver and process via mod_rewrite rules is being done either via the specific website .htaccess (located in the site’s root directory) or via httpd.conf, apache2.conf etc. depending on the Linux / BSD linux distribution Apache config file naming is used.

To enable RewriteLog near the end of the Apache configuration file its necessery to place the variables in apache conf:

1. Edit RewriteLog and place following variables:

RewriteLogLevel 9
RewriteLog /var/log/rewrite.log

RewriteLogLevel does define the level of logging that should get logged in /var/log/rewrite.log
The higher the RewriteLogLevel number defined the more debugging related to mod_rewrite requests processing gets logged.
RewriteLogLevel 9 is actually the highest loglevel that can be. Setting the RewriteLogLevel to 0 will instruct mod_rewrite to stop logging. In many cases a RewriteLogLevel of 3 is also enough to debug most of the redirect issues, however I prefer to see more, so almost always I use RewriteLogLevel of 9.

2. Create /var/log/rewrite.log and set writtable permissions

a. Create /var/log/rewrite.log

freebsd# touch /var/log/rewrite.log

b. Set writtable permissons

Either chown the file to the user with which the Apache server is running, or chmod it to permissions of 777.

On FreeBSD, chown permissions to allow webserver to write in file, should be:

freebsd# chown www:www /var/log/rewrite.log

On Debian and alike distros:

debian:~# chown www-data:www-data /var/log/rewrite.log

On CentOS, Fedora etc.:

[root@centos ~]# chown httpd:httpd /var/log/rewrite.log

On any other distribution, you don’t want to bother to check the uid:gid, the permissions can be set with chmod 777, e.g.:

linux# chmod 777 /var/log/rewrite.log

Next after RewriteLog is in conf to make configs active the usual webserver restart is required.

To restart Apache On FreeBSD:

freebsd# /usr/local/etc/rc.d/apache2 restart
...

To restart Apache on Debian and derivatives:

debian:~# /etc/init.d/apache2 restart
...

On Fedora and derivive distros:

[root@fedora ~]# /etc/init.d/httpd restart
...

Its common error to forget to set proper permissions to /var/log/rewrite.log this has puzzled me many times, when enabling RewriteLog’s logging.

Another important note is when debugging for mod_rewrite is enabled, one forgets to disable logging and after a while if the /var/log partition is placed on a small partition or is on an old server with less space often the RewriteLog fills in the disk quickly and might create website downtimes. Hence always make sure RewriteLog is disabled after work rewrite debugging is no longer needed.

The way I use to disable it is by commenting it in conf like so:

#RewriteLogLevel 9
#RewriteLog /var/log/rewrite.log

Finally to check, what the mod_rewrite processor is doing on the fly its handy to use the well known tail -f

linux# tail -f /var/log/rewrite.log

A bunch of time in watching the requests, should be enough to point to the exact problem causing broken redirects or general website malfunction.
Cheers 😉