Posts Tagged ‘log messages’

List and fix failed systemd failed services after Linux OS upgrade and how to get full info about systemd service from jorunal log

Friday, February 25th, 2022

systemd-logo-unix-linux-list-failed-systemd-services

I have recently upgraded a number of machines from Debian 10 Buster to Debian 11 Bullseye. The update as always has some issues on some machines, such as problem with package dependencies, changing a number of external package repositories etc. to match che Bullseye deb packages. On some machines the update was less painful on others but the overall line was that most of the machines after the update ended up with one or more failed systemd services. It could be that some of the machines has already had this failed services present and I never checked them from the previous time update from Debian 9 -> Debian 10 or just some mess I've left behind in the hurry when doing software installation in the past. This doesn't matter anyways the fact was that I had to deal to a number of systemctl services which I managed to track by the Failed service mesage on system boot on one of the physical machines and on the OpenXen VTY Console the rest of Virtual Machines after update had some Failed messages. Thus I've spend some good amount of time like an overall of a day or two fixing strange failed services. This is how this small article was born in attempt to help sysadmins or any home Linux desktop users, who has updated his Debian Linux / Ubuntu or any other deb based distribution but due to the chaotic nature of Linux has ended with same strange Failed services and look for a way to find the source of the failures and get rid of the problems. 
Systemd is a very complicated system and in my many sysadmin opinion it makes more problems than it solves, but okay for today's people's megalomania mindset it matches well.

Systemd_components-systemd-journalctl-cgroups-loginctl-nspawn-analyze.svg

 

1. Check the journal for errors, running service irregularities and so on
 

First thing to do to track for errors, right after the update is to take some minutes and closely check,, the journalctl for any strange errors, even on well maintained Unix machines, this journal log would bring you to a problem that is not fatal but still some process or stuff is malfunctioning in the background that you would like to solve:
 

root@pcfreak:~# journalctl -x
Jan 10 10:10:01 pcfreak CRON[17887]: pam_unix(cron:session): session closed for user root
Jan 10 10:10:01 pcfreak audit[17887]: USER_END pid=17887 uid=0 auid=0 ses=340858 subj==unconfined msg='op=PAM:session_close grantors=pam_loginuid,pam_env,pam_env,pam_permit>
Jan 10 10:10:01 pcfreak audit[17888]: CRED_DISP pid=17888 uid=0 auid=0 ses=340860 subj==unconfined msg='op=PAM:setcred grantors=pam_permit acct="root" exe="/usr/sbin/cron" >
Jan 10 10:10:01 pcfreak CRON[17888]: pam_unix(cron:session): session closed for user root
Jan 10 10:10:01 pcfreak audit[17888]: USER_END pid=17888 uid=0 auid=0 ses=340860 subj==unconfined msg='op=PAM:session_close grantors=pam_loginuid,pam_env,pam_env,pam_permit>
Jan 10 10:10:01 pcfreak audit[17884]: CRED_DISP pid=17884 uid=0 auid=0 ses=340855 subj==unconfined msg='op=PAM:setcred grantors=pam_permit acct="root" exe="/usr/sbin/cron" >
Jan 10 10:10:01 pcfreak CRON[17884]: pam_unix(cron:session): session closed for user root
Jan 10 10:10:01 pcfreak audit[17884]: USER_END pid=17884 uid=0 auid=0 ses=340855 subj==unconfined msg='op=PAM:session_close grantors=pam_loginuid,pam_env,pam_env,pam_permit>
Jan 10 10:10:01 pcfreak audit[17886]: CRED_DISP pid=17886 uid=0 auid=33 ses=340859 subj==unconfined msg='op=PAM:setcred grantors=pam_permit acct="www-data" exe="/usr/sbin/c>
Jan 10 10:10:01 pcfreak CRON[17886]: pam_unix(cron:session): session closed for user www-data
Jan 10 10:10:01 pcfreak audit[17886]: USER_END pid=17886 uid=0 auid=33 ses=340859 subj==unconfined msg='op=PAM:session_close grantors=pam_loginuid,pam_env,pam_env,pam_permi>
Jan 10 10:10:08 pcfreak NetworkManager[696]:  [1641802208.0899] device (eth1): carrier: link connected
Jan 10 10:10:08 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Up – 100Mbps/Full – flow control rx/tx
Jan 10 10:10:08 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Down
Jan 10 10:10:19 pcfreak NetworkManager[696]:
 [1641802219.7920] device (eth1): carrier: link connected
Jan 10 10:10:19 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Up – 100Mbps/Full – flow control rx/tx
Jan 10 10:10:20 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Down
Jan 10 10:10:22 pcfreak NetworkManager[696]:
 [1641802222.2772] device (eth1): carrier: link connected
Jan 10 10:10:22 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Up – 100Mbps/Full – flow control rx/tx
Jan 10 10:10:23 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Down
Jan 10 10:10:33 pcfreak sshd[18142]: Unable to negotiate with 66.212.17.162 port 19255: no matching key exchange method found. Their offer: diffie-hellman-group14-sha1,diff>
Jan 10 10:10:41 pcfreak NetworkManager[696]:
 [1641802241.0186] device (eth1): carrier: link connected
Jan 10 10:10:41 pcfreak kernel: r8169 0000:03:00.0 eth1: Link is Up – 100Mbps/Full – flow control rx/tx

If you want to only check latest journal log messages use the -x -e (pager catalog) opts

root@pcfreak;~# journalctl -xe

Feb 25 13:08:29 pcfreak audit[2284920]: USER_LOGIN pid=2284920 uid=0 auid=4294967295 ses=4294967295 subj==unconfined msg='op=login acct=28696E76616C>
Feb 25 13:08:29 pcfreak sshd[2284920]: Received disconnect from 177.87.57.145 port 40927:11: Bye Bye [preauth]
Feb 25 13:08:29 pcfreak sshd[2284920]: Disconnected from invalid user ubuntuuser 177.87.57.145 port 40927 [preauth]

Next thing to after the update was to get a list of failed service only.


2. List all systemd failed check services which was supposed to be running

root@pcfreak:/root # systemctl list-units | grep -i failed
● certbot.service                                                                                                       loaded failed failed    Certbot
● logrotate.service                                                                                                     loaded failed failed    Rotate log files
● maldet.service                                                                                                        loaded failed failed    LSB: Start/stop maldet in monitor mode
● named.service                                                                                                         loaded failed failed    BIND Domain Name Server


Alternative way is with the –failed option

hipo@jeremiah:~$ systemctl list-units –failed
  UNIT                        LOAD   ACTIVE SUB    DESCRIPTION
● haproxy.service             loaded failed failed HAProxy Load Balancer
● libvirt-guests.service      loaded failed failed Suspend/Resume Running libvirt Guests
● libvirtd.service            loaded failed failed Virtualization daemon
● nvidia-persistenced.service loaded failed failed NVIDIA Persistence Daemon
● sqwebmail.service           masked failed failed sqwebmail.service
● tpm2-abrmd.service          loaded failed failed TPM2 Access Broker and Resource Management Daemon
● wd_keepalive.service        loaded failed failed LSB: Start watchdog keepalive daemon

LOAD   = Reflects whether the unit definition was properly loaded.
ACTIVE = The high-level unit activation state, i.e. generalization of SUB.
SUB    = The low-level unit activation state, values depend on unit type.
7 loaded units listed.

 

root@jeremiah:/etc/apt/sources.list.d#  systemctl list-units –failed
  UNIT                        LOAD   ACTIVE SUB    DESCRIPTION
● haproxy.service             loaded failed failed HAProxy Load Balancer
● libvirt-guests.service      loaded failed failed Suspend/Resume Running libvirt Guests
● libvirtd.service            loaded failed failed Virtualization daemon
● nvidia-persistenced.service loaded failed failed NVIDIA Persistence Daemon
● sqwebmail.service           masked failed failed sqwebmail.service
● tpm2-abrmd.service          loaded failed failed TPM2 Access Broker and Resource Management Daemon
● wd_keepalive.service        loaded failed failed LSB: Start watchdog keepalive daemon


To get a full list of objects of systemctl you can pass as state:
 

# systemctl –state=help
Full list of possible load states to pass is here
Show service properties


Check whether a service is failed or has other status and check default set systemd variables for it.

root@jeremiah~:# systemctl is-failed vboxweb.service
inactive

# systemctl show haproxy
Type=notify
Restart=always
NotifyAccess=main
RestartUSec=100ms
TimeoutStartUSec=1min 30s
TimeoutStopUSec=1min 30s
TimeoutAbortUSec=1min 30s
TimeoutStartFailureMode=terminate
TimeoutStopFailureMode=terminate
RuntimeMaxUSec=infinity
WatchdogUSec=0
WatchdogTimestampMonotonic=0
RootDirectoryStartOnly=no
RemainAfterExit=no
GuessMainPID=yes
SuccessExitStatus=143
MainPID=304858
ControlPID=0
FileDescriptorStoreMax=0
NFileDescriptorStore=0
StatusErrno=0
Result=success
ReloadResult=success
CleanResult=success

Full output of the above command is dumped in show_systemctl_properties.txt


3. List all running systemd services for a better overview on what's going on on machine
 

To get a list of all properly systemd loaded services you can use –state running.

hipo@jeremiah:~$ systemctl list-units –state running|head -n 10
  UNIT                              LOAD   ACTIVE SUB     DESCRIPTION
  proc-sys-fs-binfmt_misc.automount loaded active running Arbitrary Executable File Formats File System Automount Point
  cups.path                         loaded active running CUPS Scheduler
  init.scope                        loaded active running System and Service Manager
  session-2.scope                   loaded active running Session 2 of user hipo
  accounts-daemon.service           loaded active running Accounts Service
  anydesk.service                   loaded active running AnyDesk
  apache-htcacheclean.service       loaded active running Disk Cache Cleaning Daemon for Apache HTTP Server
  apache2.service                   loaded active running The Apache HTTP Server
  avahi-daemon.service              loaded active running Avahi mDNS/DNS-SD Stack

 

It is useful thing is to list all unit-files configured in systemd and their state, you can do it with:

 


root@pcfreak:~# systemctl list-unit-files
UNIT FILE                                                                 STATE           VENDOR PRESET
proc-sys-fs-binfmt_misc.automount                                         static          –            
-.mount                                                                   generated       –            
backups.mount                                                             generated       –            
dev-hugepages.mount                                                       static          –            
dev-mqueue.mount                                                          static          –            
media-cdrom0.mount                                                        generated       –            
mnt-sda1.mount                                                            generated       –            
proc-fs-nfsd.mount                                                        static          –            
proc-sys-fs-binfmt_misc.mount                                             disabled        disabled     
run-rpc_pipefs.mount                                                      static          –            
sys-fs-fuse-connections.mount                                             static          –            
sys-kernel-config.mount                                                   static          –            
sys-kernel-debug.mount                                                    static          –            
sys-kernel-tracing.mount                                                  static          –            
var-www.mount                                                             generated       –            
acpid.path                                                                masked          enabled      
cups.path                                                                 enabled         enabled      

 

 


root@pcfreak:~# systemctl list-units –type service –all
  UNIT                                   LOAD      ACTIVE   SUB     DESCRIPTION
  accounts-daemon.service                loaded    inactive dead    Accounts Service
  acct.service                           loaded    active   exited  Kernel process accounting
● alsa-restore.service                   not-found inactive dead    alsa-restore.service
● alsa-state.service                     not-found inactive dead    alsa-state.service
  apache2.service                        loaded    active   running The Apache HTTP Server
● apparmor.service                       not-found inactive dead    apparmor.service
  apt-daily-upgrade.service              loaded    inactive dead    Daily apt upgrade and clean activities
 apt-daily.service                      loaded    inactive dead    Daily apt download activities
  atd.service                            loaded    active   running Deferred execution scheduler
  auditd.service                         loaded    active   running Security Auditing Service
  auth-rpcgss-module.service             loaded    inactive dead    Kernel Module supporting RPCSEC_GSS
  avahi-daemon.service                   loaded    active   running Avahi mDNS/DNS-SD Stack
  certbot.service                        loaded    inactive dead    Certbot
  clamav-daemon.service                  loaded    active   running Clam AntiVirus userspace daemon
  clamav-freshclam.service               loaded    active   running ClamAV virus database updater
..

 


linux-systemd-components-diagram-linux-kernel-system-targets-systemd-libraries-daemons

 

4. Finding out more on why a systemd configured service has failed


Usually getting info about failed systemd service is done with systemctl status servicename.service
However, in case of troubles with service unable to start to get more info about why a service has failed with (-l) or (–full) options


root@pcfreak:~# systemctl -l status logrotate.service
● logrotate.service – Rotate log files
     Loaded: loaded (/lib/systemd/system/logrotate.service; static)
     Active: failed (Result: exit-code) since Fri 2022-02-25 00:00:06 EET; 13h ago
TriggeredBy: ● logrotate.timer
       Docs: man:logrotate(8)
             man:logrotate.conf(5)
    Process: 2045320 ExecStart=/usr/sbin/logrotate /etc/logrotate.conf (code=exited, status=1/FAILURE)
   Main PID: 2045320 (code=exited, status=1/FAILURE)
        CPU: 2.479s

Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| WARNING: For now we will assume you meant to write /32
Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| ERROR: '0.0.0.0/0.0.0.0' needs to be replaced by the term 'all'.
Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| SECURITY NOTICE: Overriding config setting. Using 'all' instead.
Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| WARNING: (B) '::/0' is a subnetwork of (A) '::/0'
Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| WARNING: because of this '::/0' is ignored to keep splay tree searching predictable
Feb 25 00:00:06 pcfreak logrotate[2045577]: 2022/02/25 00:00:06| WARNING: You should probably remove '::/0' from the ACL named 'all'
Feb 25 00:00:06 pcfreak systemd[1]: logrotate.service: Main process exited, code=exited, status=1/FAILURE
Feb 25 00:00:06 pcfreak systemd[1]: logrotate.service: Failed with result 'exit-code'.
Feb 25 00:00:06 pcfreak systemd[1]: Failed to start Rotate log files.
Feb 25 00:00:06 pcfreak systemd[1]: logrotate.service: Consumed 2.479s CPU time.


systemctl -l however is providing only the last log from message a started / stopped or whatever status service has generated. Sometimes systemctl -l servicename.service is showing incomplete the splitted error message as there is a limitation of line numbers on the console, see below

 

root@pcfreak:~# systemctl status -l certbot.service
● certbot.service – Certbot
     Loaded: loaded (/lib/systemd/system/certbot.service; static)
     Active: failed (Result: exit-code) since Fri 2022-02-25 09:28:33 EET; 4h 0min ago
TriggeredBy: ● certbot.timer
       Docs: file:///usr/share/doc/python-certbot-doc/html/index.html
             https://certbot.eff.org/docs
    Process: 290017 ExecStart=/usr/bin/certbot -q renew (code=exited, status=1/FAILURE)
   Main PID: 290017 (code=exited, status=1/FAILURE)
        CPU: 9.771s

Feb 25 09:28:33 pcfrxen certbot[290017]: The error was: PluginError('An authentication script must be provided with –manual-auth-hook when using th>
Feb 25 09:28:33 pcfrxen certbot[290017]: All renewals failed. The following certificates could not be renewed:
Feb 25 09:28:33 pcfrxen certbot[290017]:   /etc/letsencrypt/live/mail.pcfreak.org-0003/fullchain.pem (failure)
Feb 25 09:28:33 pcfrxen certbot[290017]:   /etc/letsencrypt/live/www.eforia.bg-0005/fullchain.pem (failure)
Feb 25 09:28:33 pcfrxen certbot[290017]:   /etc/letsencrypt/live/zabbix.pc-freak.net/fullchain.pem (failure)
Feb 25 09:28:33 pcfrxen certbot[290017]: 3 renew failure(s), 5 parse failure(s)
Feb 25 09:28:33 pcfrxen systemd[1]: certbot.service: Main process exited, code=exited, status=1/FAILURE
Feb 25 09:28:33 pcfrxen systemd[1]: certbot.service: Failed with result 'exit-code'.
Feb 25 09:28:33 pcfrxen systemd[1]: Failed to start Certbot.
Feb 25 09:28:33 pcfrxen systemd[1]: certbot.service: Consumed 9.771s CPU time.

 

5. Get a complete log of journal to make sure everything configured on server host runs as it should

Thus to get more complete list of the message and be able to later google and look if has come with a solution on the internet  use:

root@pcfrxen:~#  journalctl –catalog –unit=certbot

— Journal begins at Sat 2022-01-22 21:14:05 EET, ends at Fri 2022-02-25 13:32:01 EET. —
Jan 23 09:58:18 pcfrxen systemd[1]: Starting Certbot…
░░ Subject: A start job for unit certbot.service has begun execution
░░ Defined-By: systemd
░░ Support: https://www.debian.org/support
░░ 
░░ A start job for unit certbot.service has begun execution.
░░ 
░░ The job identifier is 5754.
Jan 23 09:58:20 pcfrxen certbot[124996]: Traceback (most recent call last):
Jan 23 09:58:20 pcfrxen certbot[124996]:   File "/usr/lib/python3/dist-packages/certbot/_internal/renewal.py", line 71, in _reconstitute
Jan 23 09:58:20 pcfrxen certbot[124996]:     renewal_candidate = storage.RenewableCert(full_path, config)
Jan 23 09:58:20 pcfrxen certbot[124996]:   File "/usr/lib/python3/dist-packages/certbot/_internal/storage.py", line 471, in __init__
Jan 23 09:58:20 pcfrxen certbot[124996]:     self._check_symlinks()
Jan 23 09:58:20 pcfrxen certbot[124996]:   File "/usr/lib/python3/dist-packages/certbot/_internal/storage.py", line 537, in _check_symlinks

root@server:~# journalctl –catalog –unit=certbot|grep -i pluginerror|tail -1
Feb 25 09:28:33 pcfrxen certbot[290017]: The error was: PluginError('An authentication script must be provided with –manual-auth-hook when using the manual plugin non-interactively.')


Or if you want to list and read only the last messages in the journal log regarding a service

root@server:~# journalctl –catalog –pager-end –unit=certbot


If you have disabled a failed service because you don't need it to run at all on the machine with:

root@rhel:~# systemctl stop rngd.service
root@rhel:~# systemctl disable rngd.service

And you want to clear up any failed service information that is kept in the systemctl service log you can do it with:
 

root@rhel:~# systemctl reset-failed

Another useful systemctl option is cat, you can use it to easily list a service it is useful to quickly check what is a service, an actual shortcut to save you from giving a full path to the service e.g. cat /lib/systemd/system/certbot.service

root@server:~# systemctl cat certbot
# /lib/systemd/system/certbot.service
[Unit]
Description=Certbot
Documentation=file:///usr/share/doc/python-certbot-doc/html/index.html
Documentation=https://certbot.eff.org/docs
[Service]
Type=oneshot
ExecStart=/usr/bin/certbot -q renew
PrivateTmp=true


After failed SystemD services are fixed, it is best to reboot the machine and check put some more time to inspect rawly the complete journal log to make sure, no error  was left behind.


Closure
 

As you can see updating a machine from a major to a major version even if you follow the official documentation and you have plenty of experience is always more or a less a pain in the ass, which can eat up much of your time banging your head solving problems with failed daemons issues with /etc/rc.local (which I have faced becase of #/bin/sh -e (which would make /etc/rc.local) to immediately quit if any error from command $? returns different from 0 etc.. The  logical questions comes then;
1. Is it really worthy to update at all regularly, especially if you don't know of a famous major Vulnerability 🙂 ?
2. Or is it worthy to update from OS major release to OS major release at all?  
3. Or should you only try to patch the service that is exposed to an external reachable computer network or the internet only and still the the same OS release until End of Life (LTS = Long Term Support) as called in Debian or  End Of Life  (EOL) Cycle as called in RPM based distros the period until the OS major release your software distro has official security patches is reached.

Anyone could take any approach but for my own managed systems small network at home my practice was always to try to keep up2date everything every 3 or 6 months maximum. This has caused me multiple days of irritation and stress and perhaps many white hairs and spend nerves on shit.


4. Based on the company where I'm employed the better strategy is to patch to the EOL is still offered and keep the rule First Things First (FTF), once the EOL is reached, just make a copy of all servers data and configuration to external Data storage, bring up a new Physical or VM and migrate the services.
Test after the migration all works as expected if all is as it should be change the DNS records or Leading Infrastructure Proxies whatever to point to the new service and that's it! Yes it is true that migration based on a full OS reinstall is more time consuming and requires much more planning, but usually the result is much more expected, plus it is much less stressful for the guy doing the job.

How to configure haproxy logging to separate file on Redhat Enterprise Linux 8.5 Ootpa

Thursday, February 3rd, 2022

haproxy-rsyslog-architecture-logging-picture

Configuring proper logging for haproxy is always a pain in the ass in Linux, because of rsyslogd various config syntax among versions, because of bugs in OS etc. 
Today we have been given 2 Redhat 8.5 Linux servers where we had a task to start configuring haproxies, to have an idea on what is going on of course we had to enable proper haproxy logging in separate log file under separate local, for the test one can use haproxy's 

log /dev/log local6

config, this is a general way to configure logging which I've described earlier in the article How to enable haproxy logging to a separate log /var/log/haproxy.log / prevent duplicate messages to appear in /var/log/messages
However this time I wanted to not use /dev/log as this device is also used by systemd / journald and theoretically could be used by other services and there might be multiple services logging to the same places possibly leading to some issue, thus I wanted to send and process the haproxy messages directly from rsyslog on RHEL 8.5.

Create a custom file that is loaded with the rest of configuration from /etc/rsyslog.conf with a line like:
 

# Include all config files in /etc/rsyslog.d/
include(file="/etc/rsyslog.d/*.conf" mode="optional")


Create 49_haproxy.conf with below content

[root@haproxy: ~]# vim /etc/rsyslog.d/49_haproxy.conf

$ModLoad imudp
$UDPServerAddress 127.0.0.1
$UDPServerRun 514
#2022/02/02: HAProxy logs to local6, save the messages
local6.*                                                /var/log/haproxy.log
if ($programname == 'haproxy') then -/var/log/haproxy.log
& stop

touch /var/log/haproxy.log
chown haproxy:haproxy /var/log/haproxy.log

In /etc/haproxy/haproxy.cfg under global section to print in verbose mode messages (i.e. check, the haproxy is receiving properly sent traffic) do configure something like:

 

global
  log          127.0.0.1 local6 debug


Eventually you might want to remove the debug word out of the config, if you don't want to log too much verbosily once everything is properly tested and configured

[root@haproxy: ~]# curl -v -c -k 10.10.192.135:16010
* Rebuilt URL to: 10.10.192.135:15010/
*   Trying 10.10.192.135…
* TCP_NODELAY set
* Connected to 10.10.192.135 (10.10.192.135) port 15010 (#0)
> GET / HTTP/1.1
> Host: 10.10.192.135:15010
> User-Agent: curl/7.61.1
> Accept: */*

* Empty reply from server
* Connection #0 to host 10.10.192.135 left intact
curl: (52) Empty reply from server

 

In /var/log/haproxy.log you should get some messages like:
 

Feb  3 14:16:44 localhost.localdomain haproxy[25029]: proxy IN_Traffic_Bak has no server available!
Feb  3 14:16:44 localhost.localdomain haproxy[25029]: proxy IN_Traffic_Bak has no server available!
Feb  3 15:59:50 localhost.localdomain haproxy[25029]: [03/Feb/2022:15:59:50.162] 10.44.192.135:1348 -:- IN_Traffic/<NOSRV>:- -1/-1/0 0 SC 1/1/0/0/0 0/0
Feb  3 15:59:50 localhost.localdomain haproxy[25029]: [03/Feb/2022:15:59:50.162] 10.44.192.135:1348 -:- IN_Traffic/<NOSRV>:- -1/-1/0 0 SC 1/1/0/0/0 0/0
Feb  3 15:59:50 localhost.localdomain haproxy[25029]: [03/Feb/2022:15:59:50.162] 10.44.192.135:1348 -:- IN_Traffic/<NOSRV>:- -1/-1/0 0 SC 1/1/0/0/0 0/0

 

Log rsyslog script incoming tagged string message to separate external file to prevent /var/log/message from string flood

Wednesday, December 22nd, 2021

rsyslog_logo-log-external-tag-scripped-messages-to-external-file-linux-howto

If you're using some external bash script to log messages via rsyslogd to some of the multiple rsyslog understood data tubes (called in rsyslog language facility levels) and you want Rsyslog to move message string to external log file, then you had the same task as me few days ago.

For example you have a bash shell script that is writting a message to rsyslog daemon to some of the predefined facility levels be it:
 

kern,user,cron, auth etc. or some local

and your logged script data ends under the wrong file location /var/log/messages , /var/log/secure , var/log/cron etc. However  you need to log everything coming from that service to a separate file based on the localX (fac. level) the usual way to do it is via some config like, as you would usually do it with rsyslog variables as:
 

local1.info                                            /var/log/custom-log.log

# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none;local0.none;local1.none        /var/log/messages


Note the local1.none is instructing the rsyslog not to log anything from local1 facility towards /var/log/message. 
But what if this due to some weirdness in configuration of rsyslog on the server or even due to some weird misconfiguration in

/etc/systemd/journald.conf such as:

[Journal]
Storage=persistent
RateLimitInterval=0s
RateLimitBurst=0
SystemMaxUse=128M
SystemMaxFileSize=32M
MaxRetentionSec=1month
MaxFileSec=1week
ForwardToSyslog=yes
SplitFiles=none

Due to that config and especially the FowardToSyslog=yes, the messages sent via the logger tool to local1 still end up inside /var/log/messages, not nice huh ..

The result out of that is anything being sent with a predefined TAGGED string via the whatever.sh script which uses the logger command  (if you never use it check man logger) to enter message into rsyslog with cmd like:
 

# logger -p local1.info -t TAG_STRING

# logger -p local2.warn test
# tail -2 /var/log/messages
Dec 22 18:58:23 pcfreak rsyslogd: — MARK —
Dec 22 19:07:12 pcfreak hipo: test


was nevertheless logged to /var/log/message.
Of course /var/log/message becomes so overfilled with "junk" shell script data not related to real basic Operating system adminsitration, so this prevented any critical or important messages that usually should come under /var/log/message / /var/log/syslog to be lost among the big quantities of other tagged tata reaching the log.

After many attempts to resolve the issue by modifying /etc/rsyslog.conf as well as the messed /etc/systemd/journald.conf (which by the way was generated with this strange values with an OS install time automation ansible stuff). It took me a while until I found the solution on how to tell rsyslog to log the tagged message strings into an external separate file. From my 20 minutes of research online I have seen multitudes of people in different Linux OS versions to experience the same or similar issues due to whatever, thus this triggered me to write this small article on the solution to rsyslog.

The solution turned to be pretty easy but requires some further digging into rsyslog, Redhat's basic configuration on rsyslog documentation is a very nice reading for starters, in my case I've used one of the Propery-based compare-operations variable contains used to select my tagged message string.
 

1. Add msg contains compare-operations to output log file and discard the messages

[root@centos bin]# vi /etc/rsyslog.conf

# config to log everything logged to rsyslog to a separate file
:msg, contains, "tag_string:/"         /var/log/custom-script-log.log
:msg, contains, "tag_string:/"    ~

Substitute quoted tag_string:/ to whatever your tag is and mind that it is better this config is better to be placed somewhere near the beginning of /etc/rsyslog.conf and touch the file /var/log/custom-script-log.log and give it some decent permissions such as 755, i.e.
 

1.1 Discarding a message


The tilda sign –  

as placed to the end of the msg, contains is the actual one to tell the string to be discarded so it did not end in /var/log/messages.

Alternative rsyslog config to do discard the unwanted message once you have it logged is with the
rawmsg variable, like so:

 

# config to log everything logged to rsyslog to a separate file
:msg, contains, "tag_string:/"         /var/log/custom-script-log.log
:rawmsg, isequal, "tag_string:/" stop

Other way to stop logging immediately after log is written to custom file across some older versions of rsyslog is via the &stop

:msg, contains, "tag_string:/"         /var/log/custom-script-log.log
& stop

I don't know about other versions but Unfortunately the &stop does not work on RHEL 7.9 with installed rpm package rsyslog-8.24.0-57.el7_9.1.x86_64.

1.2 More with property based filters basic exclusion of string 

Property based filters can do much more, you can for example, do regular expression based matches of strings coming to rsyslog and forward to somewhere.

To select syslog messages which do not contain any mention of the words fatal and error with any or no text between them (for example, fatal lib error), type:

:msg, !regex, "fatal .* error"

 

2. Create file where tagged data should be logged and set proper permissions
 

[root@centos bin]# touch /var/log/custom-script-log.log
[root@centos bin]# chmod 755 /var/log/custom-script-log.log


3. Test rsyslogd configuration for errors and reload rsyslog

[root@centos ]# rsyslogd -N1
rsyslogd: version 8.24.0-57.el7_9.1, config validation run (level 1), master config /etc/rsyslog.conf
rsyslogd: End of config validation run. Bye.

[root@centos ]# systemctl restart rsyslog
[root@centos ]#  systemctl status rsyslog 
● rsyslog.service – System Logging Service
   Loaded: loaded (/usr/lib/systemd/system/rsyslog.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2021-12-22 13:40:11 CET; 3h 5min ago
     Docs: man:rsyslogd(8)
           http://www.rsyslog.com/doc/
 Main PID: 108600 (rsyslogd)
   CGroup: /system.slice/rsyslog.service
           └─108600 /usr/sbin/rsyslogd -n

 

4. Property-based compare-operations supported by rsyslog table
 

Compare-operation Description
contains Checks whether the provided string matches any part of the text provided by the property. To perform case-insensitive comparisons, use  contains_i .
isequal Compares the provided string against all of the text provided by the property. These two values must be exactly equal to match.
startswith Checks whether the provided string is found exactly at the beginning of the text provided by the property. To perform case-insensitive comparisons, use  startswith_i .
regex Compares the provided POSIX BRE (Basic Regular Expression) against the text provided by the property.
ereregex Compares the provided POSIX ERE (Extended Regular Expression) regular expression against the text provided by the property.
isempty Checks if the property is empty. The value is discarded. This is especially useful when working with normalized data, where some fields may be populated based on normalization result.

 


5. Rsyslog understanding Facility levels

Here is a list of facility levels that can be used.

Note: The mapping between Facility Number and Keyword is not uniform over different operating systems and different syslog implementations, so among separate Linuxes there might be diference in the naming and numbering.

Facility Number Keyword Facility Description
0 kern kernel messages
1 user user-level messages
2 mail mail system
3 daemon system daemons
4 auth security/authorization messages
5 syslog messages generated internally by syslogd
6 lpr line printer subsystem
7 news network news subsystem
8 uucp UUCP subsystem
9   clock daemon
10 authpriv security/authorization messages
11 ftp FTP daemon
12 NTP subsystem
13 log audit
14 log alert
15 cron clock daemon
16 local0 local use 0 (local0)
17 local1 local use 1 (local1)
18 local2 local use 2 (local2)
19 local3 local use 3 (local3)
20 local4 local use 4 (local4)
21 local5 local use 5 (local5)
22 local6 local use 6 (local6)
23 local7 local use 7 (local7)


6. rsyslog Severity levels (sublevels) accepted by facility level

As defined in RFC 5424, there are eight severity levels as of year 2021:

Code Severity Keyword Description General Description
0 Emergency emerg (panic) System is unusable. A "panic" condition usually affecting multiple apps/servers/sites. At this level it would usually notify all tech staff on call.
1 Alert alert Action must be taken immediately. Should be corrected immediately, therefore notify staff who can fix the problem. An example would be the loss of a primary ISP connection.
2 Critical crit Critical conditions. Should be corrected immediately, but indicates failure in a primary system, an example is a loss of a backup ISP connection.
3 Error err (error) Error conditions. Non-urgent failures, these should be relayed to developers or admins; each item must be resolved within a given time.
4 Warning warning (warn) Warning conditions. Warning messages, not an error, but indication that an error will occur if action is not taken, e.g. file system 85% full – each item must be resolved within a given time.
5 Notice notice Normal but significant condition. Events that are unusual but not error conditions – might be summarized in an email to developers or admins to spot potential problems – no immediate action required.
6 Informational info Informational messages. Normal operational messages – may be harvested for reporting, measuring throughput, etc. – no action required.
7 Debug debug Debug-level messages. Info useful to developers for debugging the application, not useful during operations.


7. Sample well tuned configuration using severity and facility levels and immark, imuxsock, impstats
 

Below is sample config using severity and facility levels
 

# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none;local0.none;local1.none        /var/log/messages


Note the local0.none; local1.none tells rsyslog to not log from that facility level to /var/log/messages.

If you need a complete set of rsyslog configuration fine tuned to have a proper logging with increased queues and included configuration for loggint to remote log aggegator service as well as other measures to prevent the system disk from being filled in case if something goes wild with a logging service leading to a repeatedly messages you might always contact me and I can help 🙂
 Other from that sysadmins might benefit from a sample set of configuration prepared with the Automated rsyslog config builder  or use some fine tuned config  for rsyslog-8.24.0-57.el7_9.1.x86_64 on Redhat 7.9 (Maipo)   rsyslog_config_redhat-2021.tar.gz.

To sum it up rsyslog though looks simple and not an important thing to pre

Stop haproxy log requests to /var/log/messages / Disable haproxy double logging

Friday, June 25th, 2021

haproxy-logo

On a CentOS Linux release 7.9.2009 (Core) I've running haproxies on two KVM virtual machines that are configured in a High Avaialability cluster with Corosync and Pacemaker, the machines are inherited from another admin (I did not install the servers hardware) and OS but have been received the system for support.
The old sysadmins seems to not care much about the system so they've left the haprxoy with Double logging one time under separate configured log in /var/log/haproxy/haproxyprod.log and each Haproxy TCP mode flown request has been double logged to /var/log/messages as well. As you can guess this shouldn't be so because we're wasting Hard drive space so to fix that I had to stop haproxy doble logging to /var/log/messages.

The logging is done under a separate local pointer local6 the /etc/haproxy/haproxyprod.cfg goes as follows:
 

[root@haproxy01 ~]# cat /etc/haproxy/haproxyprod.cfg

global
    # log <address> [len ] [max level [min level]]
    log 127.0.0.1 local6 debug

 

The logging is handled by rsyslog via the local6, so obviously to keep out the logging from /var/log/messages
The logging to the separate log file configuration in rsyslog is as follows:

local6.*                                                /var/log/haproxy/haproxyprod.log

It turned to be really easy to prevent haproxy get its requests log to /var/log/messages all I had to change is under /etc/rsyslogd.conf

local6.none config has to be placed for /var/log/messages the full line configuration in /etc/rsyslog.conf that stopped double logging is:

# Don't log private authentication messages!
*.info;mail.none;authpriv.none;cron.none;local5.none;local6.none                /var/log/messages

 

How to check Linux server power supply state is Okay / How to find out a Linux Power Supply is broken

Wednesday, January 6th, 2021

2U-power-supplies-get-status-if-Power-supply-broken-information-linux-ipmitool

If you're a sysadmin and managing remotely Linux servers, every now and then if a machine is hanging without a reason it useful to check the server Power Supply state. I say that because often if the machine is mysteriously hanging and a standard Root Cause Analysis (RCA) on /var/log/messages /var/log/dmesg /var/log/boot etc. did not bring you to any different conclusion. The next step after you send a technician to reboot the machine is to check on Linux OS level whether Power Supply Unit (PSU) hardware on the machine does not have some issues.
As blogged earlier on how to use ipmitool to manage remote ILO remote boards etc. the ipmitool can also be used to check status of Server PSUs.

Below is example output of 2 PSU server whose Power Supplies are functioning normally.
 

[root@linux-server ~]# ipmitool sdr type "Power Supply"

PS Heavy Load    | 2Bh | ok  | 19.1 | State Deasserted
Power Supply 1   | 70h | ok  | 10.1 | Presence detected
Power Supply 2   | 71h | ok  | 10.2 | Presence detected
PS Configuration | 72h | ok  | 19.1 |
PS 1 Therm Fault | 75h | ok  | 10.1 | Transition to OK
PS 2 Therm Fault | 76h | ok  | 10.2 | Transition to OK
PS1 12V OV Fault | 77h | ok  | 10.1 | Transition to OK
PS2 12V OV Fault | 78h | ok  | 10.2 | Transition to OK
PS1 12V UV Fault | 79h | ok  | 10.1 | Transition to OK
PS2 12V UV Fault | 7Ah | ok  | 10.2 | Transition to OK
PS1 12V OC Fault | 7Bh | ok  | 10.1 | Transition to OK
PS2 12V OC Fault | 7Ch | ok  | 10.2 | Transition to OK
PS1 12Vaux Fault | 7Dh | ok  | 10.1 | Transition to OK
PS2 12Vaux Fault | 7Eh | ok  | 10.2 | Transition to OK
Power Unit       | 7Fh | ok  | 19.1 | Fully Redundant

Now if you have a server lets say on an old ProLiant DL360e Gen8 whose Power Supply is damaged, you will get an from ipmitool similar to:

[root@linux-server  systemd]# ipmitool sdr type "Power Supply"
Power Supply 1   | 30h | ok  | 10.1 | 100 Watts, Presence detected
Power Supply 2   | 31h | ok  | 10.2 | 0 Watts, Presence detected, Failure detected, Power Supply AC lost
Power Supplies   | 33h | ok  | 10.3 | Redundancy Lost


If you don't have ipmitool installed due to security or whatever but you have the hardware detection software dmidecode you can use it too to get the Power Supply state

[root@linux-server  systemd]# dmidecode -t chassis
# dmidecode 3.2
Getting SMBIOS data from sysfs.
SMBIOS 2.8 present.

 

Handle 0x0300, DMI type 3, 21 bytes
Chassis Information
        Manufacturer: HP
        Type: Rack Mount Chassis
        Lock: Not Present
        Version: Not Specified
        Serial Number: CZJ38201ZH
        Asset Tag:
        Boot-up State: Critical
        Power Supply State: Critical

        Thermal State: Safe
        Security Status: Unknown
        OEM Information: 0x00000000
        Height: 1 U
        Number Of Power Cords: 2
        Contained Elements: 0

To find only Power Supply info status on a server with dmideode.

# dmidecode –type 39

monitoring-power-supply-hardware-information-linux-ipmitool

Plug between the power supply and the mainboard voltage / coms ATX specification

This can also be used on a normal Linux desktop PCs which usually have only 1U (one power supply) on many of Ubuntus and Linux desktops where lshw (list hardaware information) is installed to get the machine PSUs status with lshw 

 root@ubuntu:~# lshw -c power
  *-battery               
       product: 45N1111
       vendor: SONY
       physical id: 1
       slot: Front
       capacity: 23200mWh
       configuration: voltage=11.1V
        Thermal State: Safe
        Security Status: Unknown
        OEM Information: 0x00000000
        Height: 1 U
        Number Of Power Cords: 2
        Contained Elements: 0


Finally to get an extensive information on the voltages of the Power Supply you can use the good old lm_sensors.

# apt-get install lm-sensors
# sensors-detect 
# service kmod start

# sensors
# watch sensors


As manually monitoring Power Supplies and other various data is dubious, finally you might want to use some centralized monitoring. For one example on that you might want to check my prior Zabbix to Monitor Hardware Hard Drive / Temperature and Disk with lm_sensors / smartd on Linux with Zabbix.

How to enable HaProxy logging to a separate log /var/log/haproxy.log / prevent HAProxy duplicate messages to appear in /var/log/messages

Wednesday, February 19th, 2020

haproxy-logging-basics-how-to-log-to-separate-file-prevent-duplicate-messages-haproxy-haproxy-weblogo-squares
haproxy  logging can be managed in different form the most straight forward way is to directly use /dev/log either you can configure it to use some log management service as syslog or rsyslogd for that.

If you don't use rsyslog yet to install it: 

# apt install -y rsyslog

Then to activate logging via rsyslogd we can should add either to /etc/rsyslogd.conf or create a separte file and include it via /etc/rsyslogd.conf with following content:
 

Enable haproxy logging from rsyslogd


Log haproxy messages to separate log file you can use some of the usual syslog local0 to local7 locally used descriptors inside the conf (be aware that if you try to use some wrong value like local8, local9 as a logging facility you will get with empty haproxy.log, even though the permissions of /var/log/haproxy.log are readable and owned by haproxy user.

When logging to a local Syslog service, writing to a UNIX socket can be faster than targeting the TCP loopback address. Generally, on Linux systems, a UNIX socket listening for Syslog messages is available at /dev/log because this is where the syslog() function of the GNU C library is sending messages by default. To address UNIX socket in haproxy.cfg use:

log /dev/log local2 


If you want to log into separate log each of multiple running haproxy instances with different haproxy*.cfg add to /etc/rsyslog.conf lines like:

local2.* -/var/log/haproxylog2.log
local3.* -/var/log/haproxylog3.log


One important note to make here is since rsyslogd is used for haproxy logging you need to have enabled in rsyslogd imudp and have a UDP port listener on the machine.

E.g. somewhere in rsyslog.conf or via rsyslog include file from /etc/rsyslog.d/*.conf needs to have defined following lines:

$ModLoad imudp
$UDPServerRun 514


I prefer to use external /etc/rsyslog.d/20-haproxy.conf include file that is loaded and enabled rsyslogd via /etc/rsyslog.conf:

# vim /etc/rsyslog.d/20-haproxy.conf

$ModLoad imudp
$UDPServerRun 514​
local2.* -/var/log/haproxy2.log


It is also possible to produce different haproxy log output based on the severiy to differentiate between important and less important messages, to do so you'll need to rsyslog.conf something like:
 

# Creating separate log files based on the severity
local0.* /var/log/haproxy-traffic.log
local0.notice /var/log/haproxy-admin.log

 

Prevent Haproxy duplicate messages to appear in /var/log/messages

If you use local2 and some default rsyslog configuration then you will end up with the messages coming from haproxy towards local2 facility producing doubled simultaneous records to both your pre-defined /var/log/haproxy.log and /var/log/messages on Proxy servers that receive few thousands of simultanous connections per second.
This is a problem since doubling the log will produce too much data and on systems with smaller /var/ partition you will quickly run out of space + this haproxy requests logging to /var/log/messages makes the file quite unreadable for normal system events which are so important to track clearly what is happening on the server daily.

To prevent the haproxy duplicate messages you need to define somewhere in rsyslogd usually /etc/rsyslog.conf local2.none near line of facilities configured to log to file:

*.info;mail.none;authpriv.none;cron.none;local2.none     /var/log/messages

This configuration should work but is more rarely used as most people prefer to have haproxy log being written not directly to /dev/log which is used by other services such as syslogd / rsyslogd.

To use /dev/log to output logs from haproxy configuration in global section use config like:
 

global
        log /dev/log local2 debug
        chroot /var/lib/haproxy
        stats socket /run/haproxy/admin.sock mode 660 level admin
        stats timeout 30s
        user haproxy
        group haproxy
        daemon

The log global directive basically says, use the log line that was set in the global section for whole config till end of file. Putting a log global directive into the defaults section is equivalent to putting it into all of the subsequent proxy sections.

Using global logging rules is the most common HAProxy setup, but you can put them directly into a frontend section instead. It can be useful to have a different logging configuration as a one-off. For example, you might want to point to a different target Syslog server, use a different logging facility, or capture different severity levels depending on the use case of the backend application. 

Insetad of using /dev/log interface that is on many distributions heavily used by systemd to store / manage and distribute logs,  many haproxy server sysadmins nowdays prefer to use rsyslogd as a default logging facility that will manage haproxy logs.
Admins prefer to use some kind of mediator service to manage log writting such as rsyslogd or syslog, the reason behind might vary but perhaps most important reason is  by using rsyslogd it is possible to write logs simultaneously locally on disk and also forward logs  to a remote Logging server  running rsyslogd service.

Logging is defined in /etc/haproxy/haproxy.cfg or the respective configuration through global section but could be also configured to do a separate logging based on each of the defined Frontend Backends or default section. 
A sample exceprt from this section looks something like:

#———————————————————————
# Global settings
#———————————————————————
global
    log         127.0.0.1 local2

    chroot      /var/lib/haproxy
    pidfile     /var/run/haproxy.pid
    maxconn     4000
    user        haproxy
    group       haproxy
    daemon

    # turn on stats unix socket
    stats socket /var/lib/haproxy/stats

#———————————————————————
defaults
    mode                    tcp
    log                     global
    option                  tcplog
    #option                  dontlognull
    #option http-server-close
    #option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 7
    #timeout http-request    10s
    timeout queue           10m
    timeout connect         30s
    timeout client          20m
    timeout server          10m
    #timeout http-keep-alive 10s
    timeout check           30s
    maxconn                 3000

 

 

# HAProxy Monitoring Config
#———————————————————————
listen stats 192.168.0.5:8080                #Haproxy Monitoring run on port 8080
    mode http
    option httplog
    option http-server-close
    stats enable
    stats show-legends
    stats refresh 5s
    stats uri /stats                            #URL for HAProxy monitoring
    stats realm Haproxy\ Statistics
    stats auth hproxyauser:Password___          #User and Password for login to the monitoring dashboard

 

#———————————————————————
# frontend which proxys to the backends
#———————————————————————
frontend ft_DKV_PROD_WLPFO
    mode tcp
    bind 192.168.233.5:30000-31050
    option tcplog
    log-format %ci:%cp\ [%t]\ %ft\ %b/%s\ %Tw/%Tc/%Tt\ %B\ %ts\ %ac/%fc/%bc/%sc/%rc\ %sq/%bq
    default_backend Default_Bakend_Name


#———————————————————————
# round robin balancing between the various backends
#———————————————————————
backend bk_DKV_PROD_WLPFO
    mode tcp
    # (0) Load Balancing Method.
    balance source
    # (4) Peer Sync: a sticky session is a session maintained by persistence
    stick-table type ip size 1m peers hapeers expire 60m
    stick on src
    # (5) Server List
    # (5.1) Backend
    server Backend_Server1 10.10.10.1 check port 18088
    server Backend_Server2 10.10.10.2 check port 18088 backup


The log directive in above config instructs HAProxy to send logs to the Syslog server listening at 127.0.0.1:514. Messages are sent with facility local2, which is one of the standard, user-defined Syslog facilities. It’s also the facility that our rsyslog configuration is expecting. You can add more than one log statement to send output to multiple Syslog servers.

Once rsyslog and haproxy logging is configured as a minumum you need to restart rsyslog (assuming that haproxy config is already properly loaded):

# systemctl restart rsyslogd.service

To make sure rsyslog reloaded successfully:

systemctl status rsyslogd.service


Restarting HAproxy

If the rsyslogd logging to 127.0.0.1 port 514 was recently added a HAProxy restart should also be run, you can do it with:
 

# /usr/sbin/haproxy -f /etc/haproxy/haproxy.cfg -D -p /var/run/haproxy.pid -sf $(cat /var/run/haproxy.pid)


Or to restart use systemctl script (if haproxy is not used in a cluster with corosync / heartbeat).

# systemctl restart haproxy.service

You can control how much information is logged by adding a Syslog level by

    log         127.0.0.1 local2 info


The accepted values are the standard syslog security level severity:

Value Severity Keyword Deprecated keywords Description Condition
0 Emergency emerg panic System is unusable A panic condition.
1 Alert alert   Action must be taken immediately A condition that should be corrected immediately, such as a corrupted system database.
2 Critical crit   Critical conditions Hard device errors.
3 Error err error Error conditions  
4 Warning warning warn Warning conditions  
5 Notice notice   Normal but significant conditions Conditions that are not error conditions, but that may require special handling.
6 Informational info   Informational messages  
7 Debug debug   Debug-level messages Messages that contain information normally of use only when debugging a program.

 

Logging only errors / timeouts / retries and errors is done with option:

Note that if the rsyslog is configured to listen on different port for some weird reason you should not forget to set the proper listen port, e.g.:
 

  log         127.0.0.1:514 local2 info

option dontlog-normal

in defaults or frontend section.

You most likely want to enable this only during certain times, such as when performing benchmarking tests.

(or log-format-sd for structured-data syslog) directive in your defaults or frontend
 

Haproxy Logging shortly explained


The type of logging you’ll see is determined by the proxy mode that you set within HAProxy. HAProxy can operate either as a Layer 4 (TCP) proxy or as Layer 7 (HTTP) proxy. TCP mode is the default. In this mode, a full-duplex connection is established between clients and servers, and no layer 7 examination will be performed. When in TCP mode, which is set by adding mode tcp, you should also add option tcplog. With this option, the log format defaults to a structure that provides useful information like Layer 4 connection details, timers, byte count and so on.

Below is example of configured logging with some explanations:

Log-format "%ci:%cp [%t] %ft %b/%s %Tw/%Tc/%Tt %B %ts %ac/%fc/%bc/%sc/%rc %sq/%bq"

haproxy-logged-fields-explained
Example of Log-Format configuration as shown above outputted of haproxy config:

Log-format "%ci:%cp [%tr] %ft %b/%s %TR/%Tw/%Tc/%Tr/%Ta %ST %B %CC %CS %tsc %ac/%fc/%bc/%sc/%rc %sq/%bq %hr %hs %{+Q}r"

haproxy_http_log_format-explained1

To understand meaning of this abbreviations you'll have to closely read  haproxy-log-format.txt. More in depth info is to be found in HTTP Log format documentation


haproxy_logging-explained

Logging HTTP request headers

HTTP request header can be logged via:
 

 http-request capture

frontend website
    bind :80
    http-request capture req.hdr(Host) len 10
    http-request capture req.hdr(User-Agent) len 100
    default_backend webservers


The log will show headers between curly braces and separated by pipe symbols. Here you can see the Host and User-Agent headers for a request:

192.168.150.1:57190 [20/Dec/2018:22:20:00.899] website~ webservers/server1 0/0/1/0/1 200 462 – – —- 1/1/0/0/0 0/0 {mywebsite.com|Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/71.0.3578.80 } "GET / HTTP/1.1"

 

Haproxy Stats Monitoring Web interface


Haproxy is having a simplistic stats interface which if enabled produces some general useful information like in above screenshot, through which
you can get a very basic in browser statistics and track potential issues with the proxied traffic for all configured backends / frontends incoming outgoing
network packets configured nodes
 experienced downtimes etc.

haproxy-statistics-report-picture

The basic configuration to make the stats interface accessible would be like pointed in above config for example to enable network listener on address
 

https://192.168.0.5:8080/stats


with hproxyuser / password config would be:

# HAProxy Monitoring Config
#———————————————————————
listen stats 192.168.0.5:8080                #Haproxy Monitoring run on port 8080
    mode http
    option httplog
    option http-server-close
    stats enable
    stats show-legends
    stats refresh 5s
    stats uri /stats                            #URL for HAProxy monitoring
    stats realm Haproxy\ Statistics
    stats auth hproxyauser:Password___          #User and Password for login to the monitoring dashboard

 

 

Sessions states and disconnect errors on new application setup

Both TCP and HTTP logs include a termination state code that tells you the way in which the TCP or HTTP session ended. It’s a two-character code. The first character reports the first event that caused the session to terminate, while the second reports the TCP or HTTP session state when it was closed.

Here are some essential termination codes to track in for in the log:
 

Here are some termination code examples most commonly to see on TCP connection establishment errors:

Two-character code    Meaning
—    Normal termination on both sides.
cD    The client did not send nor acknowledge any data and eventually timeout client expired.
SC    The server explicitly refused the TCP connection.
PC    The proxy refused to establish a connection to the server because the process’ socket limit was reached while attempting to connect.


To get all non-properly exited codes the easiest way is to just grep for anything that is different from a termination code –, like that:

tail -f /var/log/haproxy.log | grep -v ' — '


This should output in real time every TCP connection that is exiting improperly.

There’s a wide variety of reasons a connection may have been closed. Detailed information about all possible termination codes can be found in the HAProxy documentation.
To get better understanding a very useful reading to haproxy Debug errors with  is in haproxy-logging.txt in that small file are collected all the cryptic error messages codes you might find in your logs when you're first time configuring the Haproxy frontend / backend and the backend application behind.

Another useful analyze tool which can be used to analyze Layer 7 HTTP traffic is halog for more on it just google around.

Mail send from command line on Linux and *BSD servers – useful for scripting

Monday, September 10th, 2018

mail-send-email-from-command-line-on-linux-and-freebsd-operating-systems-logo

Historically Email sending has been very different from what most people use it in the Office, there was no heavy Email clients such as Outlook Express no MX Exchange, no e-mail client capabilities for Calendar and Meetings schedule as it is in most of the modern corporate offices that depend on products such as Office 365 (I would call it a connectedHell 365 days a year !).

There was no free webmail and pop3 / imap providers such as Mail.Yahoo.com, Gmail.com, Hotmail.com, Yandex.com, RediffMail, Mail.com the innumerous lists goes and on.
Nope back in the day emails were doing what they were originally supposed to like the post services in real life simply send and receive messages.

For those who remember that charming times, people used to be using BBS-es (which were basicly a shared set-up home system as a server) or some of the few University Internal Email student accounts or by crazy sysadmins who received their notification and warnings logs about daemon (services) messages via local DMZ-ed network email servers and it was common to read the email directly with mail (mailx) text command or custom written scripts … It was not uncommon also that mailx was used heavily to send notification messages on triggered events from logs. Oh life was simple and clear back then, and even though today the email could be used in a similar fashion by hard-core old school sysadmins and Dev Ops / simple shell scriptings tasks or report cron jobs such usage is already in the deep history.

The number of ways one could send email in text format directly from the GNU / Linux / *BSD server to another remote mail MTA node (assuming it had properly configured Relay server be it Exim or Postifix) were plenty.

In this article I will try to rewind back some of the UNIX history by pinpointing a few of the most common ways, one used to send quick emails directly from a remote server connection terminal or lets say a cheap VPS few cents server, through something like (SSH or Telnet) etc.
 

1. Using the mail command client (part of bsd-mailx on Debian).
 

In my previous article Linux: "bash mail command not found" error fix
I ended the article with a short explanation on how this is done but I will repeat myself one more time here for the sake of clearness of this article.

root@linux:~# echo "Your Sample Message Body" | mail -s "Whatever … Message Subject" remote_receiver@remote-server-email-address.com


The mail command will connect to local server TCP PORT 25 on local configured MTA and send via it. If the local MTA is misconfigured or it doesn't have a proper MX / PTR DNS records etc. or not configure as a relay SMTP remote mail will not get delivered. Sent Email should be properly delivered at remote recipient address.

How to send HTML formatted emails using mailx command on Linux console / terminal shell using remote server through SSH ?

Connect to remote SSH server (VPS), dedicated server, home Linux router etc. and run:

 

root@linux:~# mailx -a 'Content-Type: text/html'
      -s "This is advanced mailx indeed!" < email_content.html
      "first_email_to_send_to@gmail.com, mail_recipient_2@yahoo.com"

 


email_content.html should be properly formatted (at best w3c standard compliant) HTML.

Here is an example email_content.html (skeleton file)

 

    To: your_customer@gmail.com
    Subject: This is an HTML message
    From: marketing@your_company.com
    Content-Type: text/html; charset="utf8"

    <html>
    <body>
    <div style="
        background-color:
        #abcdef; width: 300px;
        height: 300px;
        ">
    </div>
Whatever text mixed with valid email HTML tags here.
    </body>
    </html>


Above command sends to two email addresses however if you have a text formatted list of recipients you can easily use that file with a bash shell script for loop and send to multiple addresses red from lets say email_addresses_list.txt .

To further advance the one liner you can also want to provide an email attachment, lets say the file email_archive.rar by using the -A email_archive.rar argument.

 

root@linux:~# mailx -a 'Content-Type: text/html'
      -s "This is advanced mailx indeed!" -A ~/email_archive.rar < email_content.html
      "first_email_to_send_to@gmail.com, mail_recipient_2@yahoo.com"

 

For those familiar with Dan Bernstein's Qmail MTA (which even though a bit obsolete is still a Security and Stability Beast across email servers) – mailx command had to be substituted with a custom qmail one in order to be capable to send via qmail MTA daemon.
 

2. Using sendmail command to send email
 

Do you remember that heavy hard to configure MTA monster sendmail ? It was and until this very day is the default Mail Transport Agent for Slackware Linux.

Here is how we were supposed to send mail with it:

 

[root@sendmail-host ~]# vim email_content_to_be_delivered.txt

 

Content of file should be something like:

Subject: This Email is sent from UNIX Terminal Email

Hi this Email was typed in a file and send via sendmail console email client
(part of the sendmail mail server)

It is really fun to go back in the pre-history of Mail Content creation 🙂

 

[root@sendmail-host ~]# sendmail -v user_name@remote-mail-domain.com  < /tmp/email_content_to_be_delivered.txt

 

-v argument provided, will make the communication between the mail server and your mail transfer agent visible.
 

3. Using ssmtp command to send mail
 

ssmtp MTA and its included shell command was used historically as it was pretty straight forward you just launch it on the command line type on one line all your email and subject and ship it (by pressing the CTRL + D key combination).

To give it a try you can do:

 

root@linux:~# apt-get install ssmtp
Reading package lists… Done
Building dependency tree       
Reading state information… Done
The following additional packages will be installed:
  libgnutls-openssl27
The following packages will be REMOVED:
  exim4-base exim4-config exim4-daemon-heavy
The following NEW packages will be installed:
  libgnutls-openssl27 ssmtp
0 upgraded, 2 newly installed, 3 to remove and 1 not upgraded.
Need to get 239 kB of archives.
After this operation, 3,697 kB disk space will be freed.
Do you want to continue? [Y/n] Y
Get:1 http://ftp.us.debian.org/debian stretch/main amd64 ssmtp amd64 2.64-8+b2 [54.2 kB]
Get:2 http://ftp.us.debian.org/debian stretch/main amd64 libgnutls-openssl27 amd64 3.5.8-5+deb9u3 [184 kB]
Fetched 239 kB in 2s (88.5 kB/s)         
Preconfiguring packages …
dpkg: exim4-daemon-heavy: dependency problems, but removing anyway as you requested:
 mailutils depends on default-mta | mail-transport-agent; however:
  Package default-mta is not installed.
  Package mail-transport-agent is not installed.
  Package exim4-daemon-heavy which provides mail-transport-agent is to be removed.

 

(Reading database … 169307 files and directories currently installed.)
Removing exim4-daemon-heavy (4.89-2+deb9u3) …
dpkg: exim4-config: dependency problems, but removing anyway as you requested:
 exim4-base depends on exim4-config (>= 4.82) | exim4-config-2; however:
  Package exim4-config is to be removed.
  Package exim4-config-2 is not installed.
  Package exim4-config which provides exim4-config-2 is to be removed.
 exim4-base depends on exim4-config (>= 4.82) | exim4-config-2; however:
  Package exim4-config is to be removed.
  Package exim4-config-2 is not installed.
  Package exim4-config which provides exim4-config-2 is to be removed.

Removing exim4-config (4.89-2+deb9u3) …
Selecting previously unselected package ssmtp.
(Reading database … 169247 files and directories currently installed.)
Preparing to unpack …/ssmtp_2.64-8+b2_amd64.deb …
Unpacking ssmtp (2.64-8+b2) …
(Reading database … 169268 files and directories currently installed.)
Removing exim4-base (4.89-2+deb9u3) …
Selecting previously unselected package libgnutls-openssl27:amd64.
(Reading database … 169195 files and directories currently installed.)
Preparing to unpack …/libgnutls-openssl27_3.5.8-5+deb9u3_amd64.deb …
Unpacking libgnutls-openssl27:amd64 (3.5.8-5+deb9u3) …
Processing triggers for libc-bin (2.24-11+deb9u3) …
Setting up libgnutls-openssl27:amd64 (3.5.8-5+deb9u3) …
Setting up ssmtp (2.64-8+b2) …
Processing triggers for man-db (2.7.6.1-2) …
Processing triggers for libc-bin (2.24-11+deb9u3) …

 

As you see from above output local default Debian Linux Exim is removed …

Lets send a simple test email …

 

hipo@linux:~# ssmtp user@remote-mail-server.com
Subject: Simply Test SSMTP Email
This Email was send just as a test using SSMTP obscure client
via SMTP server.
^d

 

What is notable about ssmtp is that even though so obsolete today it supports of STARTTLS (email communication encryption) that is done via its config file

 

/etc/ssmtp/ssmtp.conf

 

4. Send Email from terminal using Mutt client
 

Mutt was and still is one of the swiff army of most used console text email clients along with Alpine and Fetchmail to know more about it read here

Mutt supports reading / sending mail from multiple mailboxes and capable of reading IMAP and POP3 mail fetch protocols and was a serious step forward over mailx. Its syntax pretty much resembles mailx cmds.

 

root@linux:~# mutt -s "Test Email" user@example.com < /dev/null

 

Send email including attachment a 15 megabytes MySQL backup of Squirrel Webmail

 

root@linux:~# mutt  -s "This is last backup small sized database" -a /home/backups/backup_db.sql user@remote-mail-server.com < /dev/null

 


5. Using simple telnet to test and send email (verify existence of email on remote SMTP)
 

As a Mail Server SysAdmin this is one of my best ways to test whether I had a server properly configured and even sometimes for the sake of fun I used it as a hack to send my mail 🙂
telnet is and will always be a great tool for doing SMTP issues troubleshooting.
 

It is very useful to test whether a remote SMTP TCP port 25 is opened or a local / remote server firewall prevents connections to MTA.

Below is an example connect and send example using telnet to my local SMTP on www.pc-freak.net (QMail powered (R) 🙂 )

sending-email-using-telnet-command-howto-screenshot

 

root@pcfreak:~# telnet localhost 25
Trying 127.0.0.1…
Connected to localhost.
Escape character is '^]'.
220 This is Mail Pc-Freak.NET ESMTP
HELO mail.www.pc-freak.net
250 This is Mail Pc-Freak.NET
MAIL FROM:<hipo@www.pc-freak.net>
250 ok
RCPT TO:<roots_bg@yahoo.com>
250 ok
DATA
354 go ahead
Subject: This is a test subject

 

This is just a test mail send through telnet
.
250 ok 1536440787 qp 28058
^]
telnet>

 

Note that the returned messages are native to qmail, a postfix would return a slightly different content, here is another test example to remote SMTP running sendmail or postfix.

 

root@pcfreak:~# telnet mail.servername.com 25
Trying 127.0.0.1…
Connected to localhost.localdomain (127.0.0.1).
Escape character is '^]'.
220 mail.servername.com ESMTP Sendmail 8.13.8/8.13.8; Tue, 22 Oct 2013 05:05:59 -0400
HELO yahoo.com
250 mail.servername.com Hello mail.servername.com [127.0.0.1], pleased to meet you
mail from: systemexec@gmail.com
250 2.1.0 hipo@www.pc-freak.net… Sender ok
rcpt to: hip0d@yandex.ru
250 2.1.5 hip0d@yandex.ru… Recipient ok
data
354 Enter mail, end with "." on a line by itself
Hey
This is test email only

 

Thanks
.
250 2.0.0 r9M95xgc014513 Message accepted for delivery
quit
221 2.0.0 mail.servername.com closing connection
Connection closed by foreign host.


It is handy if you want to know whether remote MTA server has a certain Emailbox existing or not with telnet by simply trying to send to a certian email and checking the Email server returned output (note that the message returned depends on the remote MTA version and many qmails are configured to not give information on the initial SMTP handshake but returns instead a MAILER DAEMON failure error sent back to your sender address. Some MX servrers are still vulnerable to this attack yet, historically dreamhost.com. Below attack screenshot is made at the times before dreamhost.com fixed the brute force email issue.

Terminal-Verify-existing-Email-with-telnet

6. Using simple netcat TCP/IP Swiss Army Knife to test and send email in console

netcat-logo-a-swiff-army-knife-of-the-hacker-and-security-expert-logo
Other tool besides telnet of testing remote / local SMTP is netcat tool (for reading and writting data across TCP and UDP connections).

The way to do it is analogous but since netcat is not present on most Linux OSes by default you need to install it through the package manager first be it apt or yum etc.

# apt-get –yes install netcat


 

First lets create a new file test_email_content.txt using bash's echo cmd.
 

 

# echo 'EHLO hostname
MAIL FROM: hip0d@yandex.ru
RCPT TO:   solutions@www.pc-freak.net
DATA
From: A tester <hip0d@yandex.ru>
To:   <solutions@www.pc-freak.net>
Date: date
Subject: A test message from test hostname

 

Delete me, please
.
QUIT
' >>test_email_content.txt

 

# netcat -C localhost 25 < test_email_content.txt

 

220 This is Mail Pc-Freak.NET ESMTP
250-This is Mail Pc-Freak.NET
250-STARTTLS
250-SIZE 80000000
250-PIPELINING
250 8BITMIME
250 ok
250 ok
354 go ahead
451 See http://pobox.com/~djb/docs/smtplf.html.

Because of its simplicity and the fact it has a bit more capabilities in reading / writing data over network it was no surprise it was among the favorite tools not only of crackers and penetration testers but also a precious debug tool for the avarage sysadmin. netcat's advantage over telnet is you can push-pull over the remote SMTP port (25) a non-interactive input.


7. Using openssl to connect and send email via encrypted channel

 

root@linux:~# openssl s_client -connect smtp.gmail.com:465 -crlf -ign_eof

    ===
               Certificate negotiation output from openssl command goes here
        ===

        220 smtp.gmail.com ESMTP j92sm925556edd.81 – gsmtp
            EHLO localhost
        250-smtp.gmail.com at your service, [78.139.22.28]
        250-SIZE 35882577
        250-8BITMIME
        250-AUTH LOGIN PLAIN XOAUTH2 PLAIN-CLIENTTOKEN OAUTHBEARER XOAUTH
        250-ENHANCEDSTATUSCODES
        250-PIPELINING
        250-CHUNKING
        250 SMTPUTF8
            AUTH PLAIN *passwordhash*
        235 2.7.0 Accepted
            MAIL FROM: <hipo@pcfreak.org>
        250 2.1.0 OK j92sm925556edd.81 – gsmtp
            rcpt to: <systemexec@gmail.com>
        250 2.1.5 OK j92sm925556edd.81 – gsmtp
            DATA
        354  Go ahead j92sm925556edd.81 – gsmtp
            Subject: This is openssl mailing

            Hello nice user
            .
        250 2.0.0 OK 1339757532 m46sm11546481eeh.9
            quit
        221 2.0.0 closing connection m46sm11546481eeh.9
        read:errno=0


8. Using CURL (URL transfer) tool to send SSL / TLS secured crypted channel emails via Gmail / Yahoo servers and MailGun Mail send API service


Using curl webpage downloading advanced tool for managing email send might be  a shocking news to many as it is idea is to just transfer data from a server.
curl is mostly used in conjunction with PHP website scripts for the reason it has a Native PHP implementation and many PHP based websites widely use it for download / upload of user data.
Interestingly besides support for HTTP and FTP it has support for POP3 and SMTP email protocols as well
If you don't have it installed on your server and you want to give it a try, install it first with apt:
 

root@linux:~# apt-get install curl

 


To learn more about curl capabilities make sure you check cURL –manual arg.
 

root@linux:~# curl –manual

 

a) Sending Emails via Gmail and other Mail Public services

Curl is capable to send emails from terminal using Gmail and Yahoo Mail services, if you want to give that a try.

gmail-settings-google-allow-less-secure-apps-sign-in-to-google-screenshot

Go to myaccount.google.com URL and login from the web interface choose Sign in And Security choose Allow less Secure Apps to be -> ON and turn on access for less secure apps in Gmail. Though I have not tested it myself so far with Yahoo! Mail, I suppose it should have a similar security settings somewhere.

Here is how to use curl to send email via Gmail.

Gmail-password-Allow-less-secure-apps-ON-screenshot-howto-to-be-able-to-send-email-with-text-commands-with-encryption-and-outlook

 

 

root@linux:~# curl –url 'smtps://smtp.gmail.com:465' –ssl-reqd \
  –mail-from 'your_email@gmail.com' –mail-rcpt 'remote_recipient@mail.com' \
  –upload-file mail.txt –user 'your_email@gmail.com:your_accout_password'


b) Sending Emails using Mailgun.com (Transactional Email Service API for developers)

To use Mailgun to script sending automated emails go to Mailgun.com and create account and generate new API key.

Then use curl in a similar way like below example:

 

curl -sv –user 'api:key-7e55d003b…f79accd31a' \
    https://api.mailgun.net/v3/sandbox21a78f824…3eb160ebc79.mailgun.org/messages \
    -F from='Excited User <developer@yourcompany.com>' \
    -F to=sandbox21a78f824…3eb160ebc79.mailgun.org \
    -F to=user_acc@gmail.com \
    -F subject='Hello' \
    -F text='Testing Mailgun service!' \
   –form-string html='<h1>EDMdesigner Blog</h1><br /><cite>This tutorial helps me understand email sending from Linux console</cite>' \
    -F attachment=@logo_picture.jpg

 

The -F option that is heavy present in above command lets curl (Emulate a form filled in button in which user has pressed the submit button).
For more info of the options check out man curl.
 

 

9. Using swaks command to send emails from

 

root@linux:~# apt-cache show swaks|grep "Description" -B 10
Package: swaks
Version: 20170101.0-1
Installed-Size: 221
Maintainer: Andreas Metzler <ametzler@debian.org>
Architecture: all
Depends: perl
Recommends: libnet-dns-perl, libnet-ssleay-perl
Suggests: perl-doc, libauthen-sasl-perl, libauthen-ntlm-perl
Description-en: SMTP command-line test tool
 swaks (Swiss Army Knife SMTP) is a command-line tool written in Perl
 for testing SMTP setups; it supports STARTTLS and SMTP AUTH (PLAIN,
 LOGIN, CRAM-MD5, SPA, and DIGEST-MD5). swaks allows one to stop the
 SMTP dialog at any stage, e.g to check RCPT TO: without actually
 sending a mail.
 .
 If you are spending too much time iterating "telnet foo.example 25"
 swaks is for you.
Description-md5: f44c6c864f0f0cb3896aa932ce2bdaa8

 

 

 

root@linux:~# apt-get instal –yes swaks

root@linux:~# swaks –to mailbox@example.com -s smtp.gmail.com:587
      -tls -au <user-account> -ap <account-password>

 


The -tls argument (in order to use gmail encrypted TLS channel on port 587)

If you want to hide the password not to provide the password from command line so (in order not to log it to user history) add the -a options.

10. Using qmail-inject on Qmail mail servers to send simple emails

Create new file with content like:
 

root@qmail:~# vim email_file_content.text
To: user@mail-example.com
Subject: Test


This is a test message.
 

root@qmail:~# cat email_file_content.text | /var/qmail/bin/qmail-inject


qmail-inject is part of ordinary qmail installation so it is very simple it even doesn't return error codes it just ships what ever given as content to remote MTA.
If the linux host where you invoke it has a properly configured qmail installation the email will get immediately delivered. The advantage of qmail-inject over the other ones is it is really lightweight and will deliver the simple message more quickly than the the prior heavy tools but again it is more a Mail Delivery Agent (MDA) for quick debugging, if MTA is not working, than for daily email writting.

It is very useful to simply test whether email send works properly without sending any email content by (I used qmail-inject to test local email delivery works like so).
 

root@linux:~# echo 'To: mailbox_acc@mail-server.com' | /var/qmail/bin/qmail-inject

 

11. Debugging why Email send with text tool is not being send properly to remote recipient

If you use some of the above described methods and email is not delivered to remote recipient email addresses check /var/log/mail.log (for a general email log and postfix MTAs – the log is present on many of the Linux distributions) and /var/log/messages or /var/log/qmal (on Qmail installations) /var/log/exim4 (on servers running Exim as MTA).

https://www.pc-freak.net/images/linux-email-log-debug-var-log-mail-output

 Closure

The ways to send email via Linux terminal are properly innumerous as there are plenty of scripted tools in various programming languages, I am sure in this article,  also missing a lot of pre-bundled installable distro packages. If you know other interesting ways / tools to send via terminal I would like to hear it.

Hope you enjoyed, happy mailing !

xorg on Toshiba Satellite L40 14B with Intel GM965 video hangs up after boot and the worst fix ever / How to reinstall Ubuntu by keeping the old personal data and programs

Wednesday, April 27th, 2011

black screen ubuntu troubles

I have updated Ubuntu version 9.04 (Jaunty) to 9.10 and followed the my previous post update ubuntu from 9.04 to Latest Ubuntu

I expected that a step by step upgrade from a release to release will work like a charm and though it does on many notebooks it doesn't on Toshiba Satellite L40

The update itself went fine, whether I used the update-manager -d and followed the above pointed tutorial, however after a system restart the PC failed to boot the X server properly, a completely blank screen with blinking cursor appeared and that was all.

I restarted the system into the 2.6.35-28-generic kernel rescue-mode recovery kernel in order to be able to enter into physical console.

Logically the first thing I did is to check /var/log/messages and /var/log/Xorg.0.log but I couldn't find nothing unusual or wrong there.

I suspected something might be wrong with /etc/X11/xorg.conf so I deleted it:

ubuntu:~# rm -f /etc/X11/xorg.conf

and attempted to re-create the xorg.conf X configuration with command:

ubuntu:~# dpkg-reconfigure xserver-xorg

This command was reported to be the usual way to reconfigure the X server settings from console, but in my case (for unknown reasons) it did nothing.

Next the command which was able to re-generate the xorg.conf file was:

ubuntu:~# X -configure

The command generates a xorg.conf sample file in /root/xorg.conf.* so I used the conf to put it in /etc/X11/xorg.conf X's default location and restarted in hope that this would fix the non-booting issue.

Very sadly again the black screen of death appeared on the notebook toshiba screen.
I further thought of completely wipe out the xorg.conf in hope that at least it might boot without the conf file but this worked out neither.

I attempted to run the Xserver with a xorg.conf configured to work with vesa as it's well known vesa X server driver is supposed to work on 99% of the video cards, as almost all of them nowdays are compatible with the vesa standard, but guess what in my case vesa worked not!

The only version of X I can boot in was the failsafe X screen mode which is available through the grub's boot menu recovery mode.

Further on I decided to try few xorg.conf which I found online and were reported to work fine with Intel GM965 internal video , and yes this was also unsucessful.

Some of my other futile attempts were: to re-install the xorg server with apt-get, reinstall the xserver-xorg-video-intel driver e.g.:

ubuntu:~# apt-get install --reinstall xserver-xorg xserver-xorg-video-intel

As nothing worked out I was completely pissed off and decided to take an alternative approach which will take a lot of time but at least will probably be succesful, I decided to completely re-install the Ubuntu from a CD after backing up the /home directory and making a list of available packages on the system, so I can further easily run a tiny bash one-liner script to install all the packages which were previously existing on the laptop before the re-install:

Here is how I did it:

First I archived the /home directory:

ubuntu:/# tar -czvf home.tar.gz home/
....

For 12GB of data with some few thousands of files archiving it took about 40 minutes.

The tar spit archive became like 9GB and I hence used sftp to upload it to a remote FTP server as I was missing a flash drive or an external HDD where I can place the just archived data.

Uploading with sftp can be achieved with a command similar to:

sftp user@yourhost.com
Password:
Connected to yourhost.com.
sftp> put home.tar.gz

As a next step to backup in a file the list of all current installed packages, before I can further proceed to boot-up with the Ubuntu Maverich 10.10 CD and prooceed with the fresh install I used command:

for i in $(dpkg -l| awk '{ print $2 }'); do
echo $i; done >> my_current_ubuntu_packages.txt

Once again I used sftp as in above example to upload my_current_update_packages.txt file to my FTP host.

After backing up all the stuff necessery, I restarted the system and booted from the CD-rom with Ubuntu.
The Ubuntu installation as usual is more than a piece of cake and even if you don't have a brain you can succeed with it, so I wouldn't comment on it 😉

Right after the installation I used the sftp client once again to fetch the home.tar.gz and my_current_ubuntu_packages.txt

I placed the home.tar.gz in /home/ and untarred it inside the fresh /home dir:

ubuntu:/home# tar -zxvf home.tar.gz

Eventually the old home directory was located in /home/home so thereon I used Midnight Commander ( the good old mc text file explorer and manager ) to restore the important user files to their respective places.

As a last step I used the my_current_ubuntu_packages.txt in combination with a tiny shell script to install all the listed packages inside the file with command:

ubuntu:~# for i in $(cat my_current_ubuntu_packagespackages.txt); do
apt-get install --yes $i; sleep 1;
done

You will have to stay in front of the computer and manually answer a ncurses interface questions concerning some packages configuration and to be honest this is really annoying and time consuming.

Summing up the overall time I spend with this stupid Toshiba Satellite L40 with the shitty Intel GM965 was 4 days, where each day I tried numerous ways to fix up the X and did my best to get through the blank screen xserver non-bootable issue, without a complete re-install of the old Ubuntu system.
This is a lesson for me that if I stumble such a shitty issues I will straight proceed to the re-install option and not loose my time with non-sense fixes which would never work.

Hope the article might be helpful to somebody else who experience some problems with Linux similar to mine.

After all at least the Ubuntu Maverick 10.10 is really good looking in general from a design perspective.
What really striked me was the placement of the close, minimize and maximize window buttons , it seems in newer Ubuntus the ubuntu guys decided to place the buttons on the left, here is a screenshot:

Left button positioning of navigation Buttons in Ubuntu 10.10

I believe the solution I explain, though very radical and slow is a solution that would always work and hence worthy 😉
Let me hear from you if the article was helpful.

Resolving “nf_conntrack: table full, dropping packet.” flood message in dmesg Linux kernel log

Wednesday, March 28th, 2012

nf_conntrack_table_full_dropping_packet
On many busy servers, you might encounter in /var/log/syslog or dmesg kernel log messages like

nf_conntrack: table full, dropping packet

to appear repeatingly:

[1737157.057528] nf_conntrack: table full, dropping packet.
[1737157.160357] nf_conntrack: table full, dropping packet.
[1737157.260534] nf_conntrack: table full, dropping packet.
[1737157.361837] nf_conntrack: table full, dropping packet.
[1737157.462305] nf_conntrack: table full, dropping packet.
[1737157.564270] nf_conntrack: table full, dropping packet.
[1737157.666836] nf_conntrack: table full, dropping packet.
[1737157.767348] nf_conntrack: table full, dropping packet.
[1737157.868338] nf_conntrack: table full, dropping packet.
[1737157.969828] nf_conntrack: table full, dropping packet.
[1737157.969928] nf_conntrack: table full, dropping packet
[1737157.989828] nf_conntrack: table full, dropping packet
[1737162.214084] __ratelimit: 83 callbacks suppressed

There are two type of servers, I've encountered this message on:

1. Xen OpenVZ / VPS (Virtual Private Servers)
2. ISPs – Internet Providers with heavy traffic NAT network routers
 

I. What is the meaning of nf_conntrack: table full dropping packet error message

In short, this message is received because the nf_conntrack kernel maximum number assigned value gets reached.
The common reason for that is a heavy traffic passing by the server or very often a DoS or DDoS (Distributed Denial of Service) attack. Sometimes encountering the err is a result of a bad server planning (incorrect data about expected traffic load by a company/companeis) or simply a sys admin error…

– Checking the current maximum nf_conntrack value assigned on host:

linux:~# cat /proc/sys/net/ipv4/netfilter/ip_conntrack_max
65536

– Alternative way to check the current kernel values for nf_conntrack is through:

linux:~# /sbin/sysctl -a|grep -i nf_conntrack_max
error: permission denied on key 'net.ipv4.route.flush'
net.netfilter.nf_conntrack_max = 65536
error: permission denied on key 'net.ipv6.route.flush'
net.nf_conntrack_max = 65536

– Check the current sysctl nf_conntrack active connections

To check present connection tracking opened on a system:

:

linux:~# /sbin/sysctl net.netfilter.nf_conntrack_count
net.netfilter.nf_conntrack_count = 12742

The shown connections are assigned dynamicly on each new succesful TCP / IP NAT-ted connection. Btw, on a systems that work normally without the dmesg log being flooded with the message, the output of lsmod is:

linux:~# /sbin/lsmod | egrep 'ip_tables|conntrack'
ip_tables 9899 1 iptable_filter
x_tables 14175 1 ip_tables

On servers which are encountering nf_conntrack: table full, dropping packet error, you can see, when issuing lsmod, extra modules related to nf_conntrack are shown as loaded:

linux:~# /sbin/lsmod | egrep 'ip_tables|conntrack'
nf_conntrack_ipv4 10346 3 iptable_nat,nf_nat
nf_conntrack 60975 4 ipt_MASQUERADE,iptable_nat,nf_nat,nf_conntrack_ipv4
nf_defrag_ipv4 1073 1 nf_conntrack_ipv4
ip_tables 9899 2 iptable_nat,iptable_filter
x_tables 14175 3 ipt_MASQUERADE,iptable_nat,ip_tables

 

II. Remove completely nf_conntrack support if it is not really necessery

It is a good practice to limit or try to omit completely use of any iptables NAT rules to prevent yourself from ending with flooding your kernel log with the messages and respectively stop your system from dropping connections.

Another option is to completely remove any modules related to nf_conntrack, iptables_nat and nf_nat.
To remove nf_conntrack support from the Linux kernel, if for instance the system is not used for Network Address Translation use:

/sbin/rmmod iptable_nat
/sbin/rmmod ipt_MASQUERADE
/sbin/rmmod rmmod nf_nat
/sbin/rmmod rmmod nf_conntrack_ipv4
/sbin/rmmod nf_conntrack
/sbin/rmmod nf_defrag_ipv4

Once the modules are removed, be sure to not use iptables -t nat .. rules. Even attempt to list, if there are any NAT related rules with iptables -t nat -L -n will force the kernel to load the nf_conntrack modules again.

Btw nf_conntrack: table full, dropping packet. message is observable across all GNU / Linux distributions, so this is not some kind of local distribution bug or Linux kernel (distro) customization.
 

III. Fixing the nf_conntrack … dropping packets error

– One temporary, fix if you need to keep your iptables NAT rules is:

linux:~# sysctl -w net.netfilter.nf_conntrack_max=131072

I say temporary, because raising the nf_conntrack_max doesn't guarantee, things will get smoothly from now on.
However on many not so heavily traffic loaded servers just raising the net.netfilter.nf_conntrack_max=131072 to a high enough value will be enough to resolve the hassle.

– Increasing the size of nf_conntrack hash-table

The Hash table hashsize value, which stores lists of conntrack-entries should be increased propertionally, whenever net.netfilter.nf_conntrack_max is raised.

linux:~# echo 32768 > /sys/module/nf_conntrack/parameters/hashsize
The rule to calculate the right value to set is:
hashsize = nf_conntrack_max / 4

– To permanently store the made changes ;a) put into /etc/sysctl.conf:

linux:~# echo 'net.netfilter.nf_conntrack_count = 131072' >> /etc/sysctl.conf
linux:~# /sbin/sysct -p

b) put in /etc/rc.local (before the exit 0 line):

echo 32768 > /sys/module/nf_conntrack/parameters/hashsize

Note: Be careful with this variable, according to my experience raising it to too high value (especially on XEN patched kernels) could freeze the system.
Also raising the value to a too high number can freeze a regular Linux server running on old hardware.

– For the diagnosis of nf_conntrack stuff there is ;

/proc/sys/net/netfilter kernel memory stored directory. There you can find some values dynamically stored which gives info concerning nf_conntrack operations in "real time":

linux:~# cd /proc/sys/net/netfilter
linux:/proc/sys/net/netfilter# ls -al nf_log/

total 0
dr-xr-xr-x 0 root root 0 Mar 23 23:02 ./
dr-xr-xr-x 0 root root 0 Mar 23 23:02 ../
-rw-r--r-- 1 root root 0 Mar 23 23:02 0
-rw-r--r-- 1 root root 0 Mar 23 23:02 1
-rw-r--r-- 1 root root 0 Mar 23 23:02 10
-rw-r--r-- 1 root root 0 Mar 23 23:02 11
-rw-r--r-- 1 root root 0 Mar 23 23:02 12
-rw-r--r-- 1 root root 0 Mar 23 23:02 2
-rw-r--r-- 1 root root 0 Mar 23 23:02 3
-rw-r--r-- 1 root root 0 Mar 23 23:02 4
-rw-r--r-- 1 root root 0 Mar 23 23:02 5
-rw-r--r-- 1 root root 0 Mar 23 23:02 6
-rw-r--r-- 1 root root 0 Mar 23 23:02 7
-rw-r--r-- 1 root root 0 Mar 23 23:02 8
-rw-r--r-- 1 root root 0 Mar 23 23:02 9

 

IV. Decreasing other nf_conntrack NAT time-out values to prevent server against DoS attacks

Generally, the default value for nf_conntrack_* time-outs are (unnecessery) large.
Therefore, for large flows of traffic even if you increase nf_conntrack_max, still shorty you can get a nf_conntrack overflow table resulting in dropping server connections. To make this not happen, check and decrease the other nf_conntrack timeout connection tracking values:

linux:~# sysctl -a | grep conntrack | grep timeout
net.netfilter.nf_conntrack_generic_timeout = 600
net.netfilter.nf_conntrack_tcp_timeout_syn_sent = 120
net.netfilter.nf_conntrack_tcp_timeout_syn_recv = 60
net.netfilter.nf_conntrack_tcp_timeout_established = 432000
net.netfilter.nf_conntrack_tcp_timeout_fin_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_close_wait = 60
net.netfilter.nf_conntrack_tcp_timeout_last_ack = 30
net.netfilter.nf_conntrack_tcp_timeout_time_wait = 120
net.netfilter.nf_conntrack_tcp_timeout_close = 10
net.netfilter.nf_conntrack_tcp_timeout_max_retrans = 300
net.netfilter.nf_conntrack_tcp_timeout_unacknowledged = 300
net.netfilter.nf_conntrack_udp_timeout = 30
net.netfilter.nf_conntrack_udp_timeout_stream = 180
net.netfilter.nf_conntrack_icmp_timeout = 30
net.netfilter.nf_conntrack_events_retry_timeout = 15
net.ipv4.netfilter.ip_conntrack_generic_timeout = 600
net.ipv4.netfilter.ip_conntrack_tcp_timeout_syn_sent = 120
net.ipv4.netfilter.ip_conntrack_tcp_timeout_syn_sent2 = 120
net.ipv4.netfilter.ip_conntrack_tcp_timeout_syn_recv = 60
net.ipv4.netfilter.ip_conntrack_tcp_timeout_established = 432000
net.ipv4.netfilter.ip_conntrack_tcp_timeout_fin_wait = 120
net.ipv4.netfilter.ip_conntrack_tcp_timeout_close_wait = 60
net.ipv4.netfilter.ip_conntrack_tcp_timeout_last_ack = 30
net.ipv4.netfilter.ip_conntrack_tcp_timeout_time_wait = 120
net.ipv4.netfilter.ip_conntrack_tcp_timeout_close = 10
net.ipv4.netfilter.ip_conntrack_tcp_timeout_max_retrans = 300
net.ipv4.netfilter.ip_conntrack_udp_timeout = 30
net.ipv4.netfilter.ip_conntrack_udp_timeout_stream = 180
net.ipv4.netfilter.ip_conntrack_icmp_timeout = 30

All the timeouts are in seconds. net.netfilter.nf_conntrack_generic_timeout as you see is quite high – 600 secs = (10 minutes).
This kind of value means any NAT-ted connection not responding can stay hanging for 10 minutes!

The value net.netfilter.nf_conntrack_tcp_timeout_established = 432000 is quite high too (5 days!)
If this values, are not lowered the server will be an easy target for anyone who would like to flood it with excessive connections, once this happens the server will quick reach even the raised up value for net.nf_conntrack_max and the initial connection dropping will re-occur again …

With all said, to prevent the server from malicious users, situated behind the NAT plaguing you with Denial of Service attacks:

Lower net.ipv4.netfilter.ip_conntrack_generic_timeout to 60 – 120 seconds and net.ipv4.netfilter.ip_conntrack_tcp_timeout_established to stmh. like 54000

linux:~# sysctl -w net.ipv4.netfilter.ip_conntrack_generic_timeout = 120
linux:~# sysctl -w net.ipv4.netfilter.ip_conntrack_tcp_timeout_established = 54000

This timeout should work fine on the router without creating interruptions for regular NAT users. After changing the values and monitoring for at least few days make the changes permanent by adding them to /etc/sysctl.conf

linux:~# echo 'net.ipv4.netfilter.ip_conntrack_generic_timeout = 120' >> /etc/sysctl.conf
linux:~# echo 'net.ipv4.netfilter.ip_conntrack_tcp_timeout_established = 54000' >> /etc/sysctl.conf

Maximal protection against SSH attacks. If your server has to stay with open SSH (Secure Shell) port open to the world

Thursday, April 7th, 2011

Brute Force Attack SSH screen, Script kiddie attacking
If you’re a a remote Linux many other Unix based OSes, you have defitenily faced the security threat of many failed ssh logins or as it’s better known a brute force attack

During such attacks your /var/log/messages or /var/log/auth gets filled in with various failed password logs like for example:

Feb 3 20:25:50 linux sshd[32098]: Failed password for invalid user oracle from 95.154.249.193 port 51490 ssh2
Feb 3 20:28:30 linux sshd[32135]: Failed password for invalid user oracle1 from 95.154.249.193 port 42778 ssh2
Feb 3 20:28:55 linux sshd[32141]: Failed password for invalid user test1 from 95.154.249.193 port 51072 ssh2
Feb 3 20:30:15 linux sshd[32163]: Failed password for invalid user test from 95.154.249.193 port 47481 ssh2
Feb 3 20:33:20 linux sshd[32211]: Failed password for invalid user testuser from 95.154.249.193 port 51731 ssh2
Feb 3 20:35:32 linux sshd[32249]: Failed password for invalid user user from 95.154.249.193 port 38966 ssh2
Feb 3 20:35:59 linux sshd[32256]: Failed password for invalid user user1 from 95.154.249.193 port 55850 ssh2
Feb 3 20:36:25 linux sshd[32268]: Failed password for invalid user user3 from 95.154.249.193 port 36610 ssh2
Feb 3 20:36:52 linux sshd[32274]: Failed password for invalid user user4 from 95.154.249.193 port 45514 ssh2
Feb 3 20:37:19 linux sshd[32279]: Failed password for invalid user user5 from 95.154.249.193 port 54262 ssh2
Feb 3 20:37:45 linux sshd[32285]: Failed password for invalid user user2 from 95.154.249.193 port 34755 ssh2
Feb 3 20:38:11 linux sshd[32292]: Failed password for invalid user info from 95.154.249.193 port 43146 ssh2
Feb 3 20:40:50 linux sshd[32340]: Failed password for invalid user peter from 95.154.249.193 port 46411 ssh2
Feb 3 20:43:02 linux sshd[32372]: Failed password for invalid user amanda from 95.154.249.193 port 59414 ssh2
Feb 3 20:43:28 linux sshd[32378]: Failed password for invalid user postgres from 95.154.249.193 port 39228 ssh2
Feb 3 20:43:55 linux sshd[32384]: Failed password for invalid user ftpuser from 95.154.249.193 port 47118 ssh2
Feb 3 20:44:22 linux sshd[32391]: Failed password for invalid user fax from 95.154.249.193 port 54939 ssh2
Feb 3 20:44:48 linux sshd[32397]: Failed password for invalid user cyrus from 95.154.249.193 port 34567 ssh2
Feb 3 20:45:14 linux sshd[32405]: Failed password for invalid user toto from 95.154.249.193 port 42350 ssh2
Feb 3 20:45:42 linux sshd[32410]: Failed password for invalid user sophie from 95.154.249.193 port 50063 ssh2
Feb 3 20:46:08 linux sshd[32415]: Failed password for invalid user yves from 95.154.249.193 port 59818 ssh2
Feb 3 20:46:34 linux sshd[32424]: Failed password for invalid user trac from 95.154.249.193 port 39509 ssh2
Feb 3 20:47:00 linux sshd[32432]: Failed password for invalid user webmaster from 95.154.249.193 port 47424 ssh2
Feb 3 20:47:27 linux sshd[32437]: Failed password for invalid user postfix from 95.154.249.193 port 55615 ssh2
Feb 3 20:47:54 linux sshd[32442]: Failed password for www-data from 95.154.249.193 port 35554 ssh2
Feb 3 20:48:19 linux sshd[32448]: Failed password for invalid user temp from 95.154.249.193 port 43896 ssh2
Feb 3 20:48:46 linux sshd[32453]: Failed password for invalid user service from 95.154.249.193 port 52092 ssh2
Feb 3 20:49:13 linux sshd[32458]: Failed password for invalid user tomcat from 95.154.249.193 port 60261 ssh2
Feb 3 20:49:40 linux sshd[32464]: Failed password for invalid user upload from 95.154.249.193 port 40236 ssh2
Feb 3 20:50:06 linux sshd[32469]: Failed password for invalid user debian from 95.154.249.193 port 48295 ssh2
Feb 3 20:50:32 linux sshd[32479]: Failed password for invalid user apache from 95.154.249.193 port 56437 ssh2
Feb 3 20:51:00 linux sshd[32492]: Failed password for invalid user rds from 95.154.249.193 port 45540 ssh2
Feb 3 20:51:26 linux sshd[32501]: Failed password for invalid user exploit from 95.154.249.193 port 53751 ssh2
Feb 3 20:51:51 linux sshd[32506]: Failed password for invalid user exploit from 95.154.249.193 port 33543 ssh2
Feb 3 20:52:18 linux sshd[32512]: Failed password for invalid user postgres from 95.154.249.193 port 41350 ssh2
Feb 3 21:02:04 linux sshd[32652]: Failed password for invalid user shell from 95.154.249.193 port 54454 ssh2
Feb 3 21:02:30 linux sshd[32657]: Failed password for invalid user radio from 95.154.249.193 port 35462 ssh2
Feb 3 21:02:57 linux sshd[32663]: Failed password for invalid user anonymous from 95.154.249.193 port 44290 ssh2
Feb 3 21:03:23 linux sshd[32668]: Failed password for invalid user mark from 95.154.249.193 port 53285 ssh2
Feb 3 21:03:50 linux sshd[32673]: Failed password for invalid user majordomo from 95.154.249.193 port 34082 ssh2
Feb 3 21:04:43 linux sshd[32684]: Failed password for irc from 95.154.249.193 port 50918 ssh2
Feb 3 21:05:36 linux sshd[32695]: Failed password for root from 95.154.249.193 port 38577 ssh2
Feb 3 21:06:30 linux sshd[32705]: Failed password for bin from 95.154.249.193 port 53564 ssh2
Feb 3 21:06:56 linux sshd[32714]: Failed password for invalid user dev from 95.154.249.193 port 34568 ssh2
Feb 3 21:07:23 linux sshd[32720]: Failed password for root from 95.154.249.193 port 43799 ssh2
Feb 3 21:09:10 linux sshd[32755]: Failed password for invalid user bob from 95.154.249.193 port 50026 ssh2
Feb 3 21:09:36 linux sshd[32761]: Failed password for invalid user r00t from 95.154.249.193 port 58129 ssh2
Feb 3 21:11:50 linux sshd[537]: Failed password for root from 95.154.249.193 port 58358 ssh2

This brute force dictionary attacks often succeed where there is a user with a weak a password, or some old forgotten test user account.
Just recently on one of the servers I administrate I have catched a malicious attacker originating from Romania, who was able to break with my system test account with the weak password tset .

Thanksfully the script kiddie was unable to get root access to my system, so what he did is he just started another ssh brute force scanner to crawl the net and look for some other vulnerable hosts.

As you read in my recent example being immune against SSH brute force attacks is a very essential security step, the administrator needs to take on a newly installed server.

The easiest way to get read of the brute force attacks without using some external brute force filtering software like fail2ban can be done by:

1. By using an iptables filtering rule to filter every IP which has failed in logging in more than 5 times

To use this brute force prevention method you need to use the following iptables rules:
linux-host:~# /sbin/iptables -I INPUT -p tcp --dport 22 -i eth0 -m state -state NEW -m recent -set
linux-host:~# /sbin/iptables -I INPUT -p tcp --dport 22 -i eth0 -m state -state NEW
-m recent -update -seconds 60 -hitcount 5 -j DROP

This iptables rules will filter out the SSH port to an every IP address with more than 5 invalid attempts to login to port 22

2. Getting rid of brute force attacks through use of hosts.deny blacklists

sshbl – The SSH blacklist, updated every few minutes, contains IP addresses of hosts which tried to bruteforce into any of currently 19 hosts (all running OpenBSD, FreeBSD or some Linux) using the SSH protocol. The hosts are located in Germany, the United States, United Kingdom, France, England, Ukraine, China, Australia, Czech Republic and setup to report and log those attempts to a central database. Very similar to all the spam blacklists out there.

To use sshbl you will have to set up in your root crontab the following line:

*/60 * * * * /usr/bin/wget -qO /etc/hosts.deny http://www.sshbl.org/lists/hosts.deny

To set it up from console issue:

linux-host:~# echo '*/60 * * * * /usr/bin/wget -qO /etc/hosts.deny http://www.sshbl.org/lists/hosts.deny' | crontab -u root -

These crontab will download and substitute your system default hosts with the one regularly updated on sshbl.org , thus next time a brute force attacker which has been a reported attacker will be filtered out as your Linux or Unix system finds out the IP matches an ip in /etc/hosts.deny

The /etc/hosts.deny filtering rules are written in a way that only publicly known brute forcer IPs will only be filtered for the SSH service, therefore other system services like Apache or a radio, tv streaming server will be still accessible for the brute forcer IP.

It’s a good practice actually to use both of the methods 😉
Thanks to Static (Multics) a close friend of mine for inspiring this article.