Archive for the ‘Linux’ Category

Fix stale NFS mounts on a server with dmesg error log: nfs: server nfs-server not responding, still trying

Saturday, March 16th, 2019

NFS_Filesystem-fix-staled-NFS-System-dmesg-error-nfs-server-not-responding-still-trying

On a server today I found a number of NFS mounts, defined through /etc/fstab file definitions, that were hanging;
 

nfs-server:~# df -hT


The command kept hanging, and any attempt to access the mounted NFS directories was impossible.
The server with the hung Network File System runs SLES (SUSE Linux Enterprise Server 12 SP3); a short investigation in the kernel log (dmesg) as well as /var/log/messages reveals the following errors:

 

nfs-server:~# dmesg
[3117414.856995] nfs: server nfs-server OK
[3117595.104058] nfs: server nfs-server not responding, still trying
[3117625.032864] nfs: server nfs-server OK
[3117805.280036] nfs: server nfs-server not responding, still trying
[3117835.209110] nfs: server nfs-server OK
[3118015.456045] nfs: server nfs-server not responding, still trying
[3118045.384930] nfs: server nfs-server OK
[3118225.568029] nfs: server nfs-server not responding, still trying
[3118255.560536] nfs: server nfs-server OK
[3118435.808035] nfs: server nfs-server not responding, still trying
[3118465.736463] nfs: server nfs-server OK
[3118645.984057] nfs: server nfs-server not responding, still trying
[3118675.912595] nfs: server nfs-server OK
[3118886.098614] nfs: server nfs-server OK
[3119066.336035] nfs: server nfs-server not responding, still trying
[3119096.274493] nfs: server nfs-server OK
[3119276.512033] nfs: server nfs-server not responding, still trying
[3119306.440455] nfs: server nfs-server OK
[3119486.688029] nfs: server nfs-server not responding, still trying
[3119516.616622] nfs: server nfs-server OK
[3119696.864032] nfs: server nfs-server not responding, still trying
[3119726.792650] nfs: server nfs-server OK
[3119907.040037] nfs: server nfs-server not responding, still trying
[3119936.968691] nfs: server nfs-server OK
[3120117.216053] nfs: server nfs-server not responding, still trying
[3120147.144476] nfs: server nfs-server OK
[3120328.352037] nfs: server nfs-server not responding, still trying
[3120567.496808] nfs: server nfs-server OK
[3121370.592040] nfs: server nfs-server not responding, still trying
[3121400.520779] nfs: server nfs-server OK
[3121400.520866] nfs: server nfs-server OK


It took me a short while to investigate and check the remote NetApp NFS storage filesystem and the Virtual Machine running on top of the OpenXen Hypervisor.
The permissions of the exported NFS filesystems were checked and were in good shape, the NFS share was also re-exported, and on the Linux mount host the following commands were run to remount the hung filesystems:

 

nfs-server:~# umount -f /mnt/nfs_share
nfs-server:~# umount -l /mnt/nfs_share
nfs-server:~# umount -lf /mnt/nfs_share1
nfs-server:~# umount -lf /mnt/nfs_share2
nfs-server:~# mount -t nfs -o remount /mnt/nfs_share


That fixed one of the hung mounts, but as I didn't want to manually remount each of the NFS filesystems, I remounted them all with:

nfs-server:~# mount -a -t nfs


This solved it, but the fix turned out not to be permanent, as after a while the issue started reoccurring. Some further investigation of the weird NFS hanging problem led me to a blog post where the same problem was described; it pointed out that the root cause lies in the MTU parameter. An MTU of 9000 is quite high and has over the years proven to cause problems with NFS, especially because network routers (switches) are often configured with filters that only pass packets with lower MTU values; combined with custom rsize / wsize NFS mount values in /etc/fstab, this can lead to these strange NFS hangs.
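For reference, an /etc/fstab NFS entry with such custom rsize / wsize mount options typically looks like the line below (server name, export path and mount point here are made-up placeholders):

nfs-server:/vol/data  /mnt/nfs_share  nfs  rw,hard,intr,rsize=32768,wsize=32768  0  0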

Below is a list of Maximum Transmission Unit (MTU) values for various transport media, an excerpt taken from Wikipedia as of the time of writing this article.

http://pc-freak.net/images/Maximum-Transmission-Unit-for-Media-Transport-diagram-3.png

In my further research on the issue I've come across this very interesting article which explains a lot on "Large Internet" and Internet Performance

I've used the tracepath command, which does basically the same as traceroute but can be run without the root user; it discovers the hops (network routers) along the way and shows the MTU on the path to the destination.

Below is a sample example

nfs-server:~# tracepath bergon.net
 1?: [LOCALHOST]                      pmtu 1500
 1:  192.168.6.1                                           0.909ms
 1:  192.168.6.1                                           0.966ms
 2:  192.168.222.1                                         0.859ms
 3:  6.192.104.109.bergon.net                              1.138ms reached
     Resume: pmtu 1500 hops 3 back 3

 

The optimal pmtu for this connection turns out to be 1500. traceroute in some cases might return hops with 'no reply' if a router along the path filters UDP packets.
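A quick way to verify whether 9000-byte jumbo frames actually make it across a path is to ping with fragmentation prohibited: 8972 bytes of ICMP payload plus 28 bytes of headers add up to exactly 9000 bytes (the target below is the gateway from the tracepath example):

nfs-server:~# ping -M do -s 8972 -c 3 192.168.6.1

If a device along the path only passes smaller frames, ping reports the message is too long for the path MTU, confirming the oversized MTU as the likely culprit.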

The high MTU value for the Storage network connection interface on eth1 was evident with a simple:

 

 nfs-server:~# /sbin/ifconfig |grep -i eth -A 2
eth0      Link encap:Ethernet  HWaddr 00:16:3E:5C:65:74
          inet addr:100.127.108.56  Bcast:100.127.109.255  Mask:255.255.254.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1

eth1      Link encap:Ethernet  HWaddr 00:16:3E:5C:65:76
          inet addr:100.96.80.94  Bcast:100.96.83.255  Mask:255.255.252.0
          UP BROADCAST RUNNING MULTICAST  MTU:9000  Metric:1


The fix was as simple as lowering the MTU value of the eth1 Ethernet interface to 1500, which is the value most network routers are configured to.

To apply the new MTU to the eth1 interface without restarting the SuSE SLES networking, I first used ifconfig once with:

 

 nfs-server:~# /sbin/ifconfig eth1 mtu 1500
 nfs-server:~# ip addr show
 …


To make the setting permanent on the next SuSE boot, I had to set the MTU='1500' value in the interface configuration file /etc/sysconfig/network/ifcfg-eth1.
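A minimal sketch of the relevant line in that file, assuming the usual SLES sysconfig syntax (all the other variables in the file stay untouched):

MTU='1500'

Then verify the new value is active: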
nfs-server:~#  ip address show eth1
3: eth1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether 8c:89:a5:f2:e8:d8 brd ff:ff:ff:ff:ff:ff
    inet 192.168.0.1/24 brd 192.168.0.255 scope global eth1
       valid_lft forever preferred_lft forever

 


Then, to remount the hung NFS filesystems, I once again ran:
 

nfs-server:~# mount -a -t nfs


Many network routers keep the MTU as low as 1500 also because higher values cause IP packet fragmentation when using NFS over UDP, where IP packet fragmentation and packet reassembly require a significant amount of CPU at both ends of the network connection.
Packet fragmentation also exposes network traffic to greater unreliability, since a complete RPC request must be retransmitted if a UDP packet fragment is dropped for any reason.
Any increase in RPC retransmissions, along with the possibility of increased timeouts, is the single worst impediment to performance for NFS over UDP.
This and much more is very well explained in the Optimizing NFS Performance page (which is a must-read) for any sysadmin who plans to use NFS frequently.

Even though lowering the MTU (Maximum Transmission Unit) value did solve my problem, in some cases, especially on modern local LANs with Jumbo Frames support, allowing and increasing the MTU to 9000 bytes might be a good idea, as the larger packet size will raise network performance; however, as always, on distant networks with many router hops, keeping the MTU value as low as 1492 / 1500 is a good idea.

 

Squid Proxy log timestamp human readable / Convert and beautify Proxy unixtime logs in human-readable form howto

Thursday, February 21st, 2019

ccze-squid-access-log-colorized-with-log-analizer-linux-tool

If you have installed the Squid Caching Proxy recently and you need to watch who is accessing the proxy and which websites are viewed, under /var/log/squid/access.log, /var/log/squid/store.log etc., you will be unpleasantly surprised that the log records are in a weird, human-unreadable format: Squid Proxy server does not store the date / year / hour time information in a human-readable form, but as Unix timestamps.

Squid uses the format:
<UNIX timestamp>.<milliseconds>, and you have to be a robot of a kind or a math genius to read it 🙂
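To convert a single timestamp by hand, GNU date can do it (the epoch value below is just an example):

date -d @1550507927

which prints the corresponding date / time in your local timezone.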

To display the Squid Proxy log in human-readable form, luckily, you can use the below Perl one-liner.
 

cat access.log | perl -p -e 's/^([0-9]*)/"[".localtime($1)."]"/e'


If you have to review squid logs multiple times and on a regular basis you can set some kind of cmd alias in $HOME/.bashrc; note the quote escaping needed to nest the Perl expression inside single quotes (the full log path is used here so the alias works from any directory):

alias readproxylog='perl -p -e '\''s/^([0-9]*)/"[".localtime($1)."]"/e'\'' /var/log/squid/access.log'

Or, for those who prefer beauty, install and use a log beautifier / colorizer such as ccze:
 

root@pcfreak:/home/hipo# apt-cache show ccze|grep -i desc -A 3
Description-en: robust, modular log coloriser
 CCZE is a robust and modular log coloriser, with plugins for apm,
 exim, fetchmail, httpd, postfix, procmail, squid, syslog, ulogd,
 vsftpd, xferlog and more.

Description-md5: 55cd93dbcf614712a4d89cb3489414f6
Homepage: https://github.com/madhouse/ccze
Tag: devel::prettyprint, implemented-in::c, interface::commandline,
 role::program, scope::utility, use::checking, use::filtering,

root@pcfreak:/home/hipo# apt-get install --yes ccze

 

tail -f /var/log/squid/access.log | ccze -CA


ccze is really nice to view /var/log/syslog errors and make your daily sysadmin life a bit more colorful

 

tail -f -n 200 /var/log/messages | ccze


tail-ccze-syslog-screenshot viewing in Colors your Linux logs

For frequent tail + ccze usage you can add the following small shell function to ~/.bashrc:
 

tailc () { tail "$@" | ccze -A; }
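Once the function is sourced (e.g. with source ~/.bashrc), it can be used just like tail itself:

tailc -f -n 200 /var/log/squid/access.log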


Below is the list of syntax highlighting colorizer plugins supported by ccze:

$ ccze -l
Available plugins:

Name      | Type    | Description
————————————————————
apm       | Partial | Coloriser for APM sub-logs.
distcc    | Full    | Coloriser for distcc(1) logs.
dpkg      | Full    | Coloriser for dpkg logs.
exim      | Full    | Coloriser for exim logs.
fetchmail | Partial | Coloriser for fetchmail(1) sub-logs.
ftpstats  | Full    | Coloriser for ftpstats (pure-ftpd) logs.
httpd     | Full    | Coloriser for generic HTTPD access and error logs.
icecast   | Full    | Coloriser for Icecast(8) logs.
oops      | Full    | Coloriser for oops proxy logs.
php       | Full    | Coloriser for PHP logs.
postfix   | Partial | Coloriser for postfix(1) sub-logs.
procmail  | Full    | Coloriser for procmail(1) logs.
proftpd   | Full    | Coloriser for proftpd access and auth logs.
squid     | Full    | Coloriser for squid access, store and cache logs.
sulog     | Full    | Coloriser for su(1) logs.
super     | Full    | Coloriser for super(1) logs.
syslog    | Full    | Generic syslog(8) log coloriser.
ulogd     | Partial | Coloriser for ulogd sub-logs.
vsftpd    | Full    | Coloriser for vsftpd(8) logs.
xferlog   | Full    | Generic xferlog coloriser.

In many cases, for sysadmins like me who prefer clarity over obscurity, an even better solution is to change the logging format in /etc/squid/squid.conf to turn it into human-readable form; to do so add to the config somewhere:

 

logformat squid %tl.%03tu %6tr %>a %Ss/%03Hs %


You will get log output in format like:

 

18/Feb/2019:18:38:47 +0200.538 4787 y.y.y.y TCP_MISS/200 41841 GET http://google.com/ – DIRECT/x.x.x.x text/html


Squid's recognized format parameters in the above example are as follows:

 

%    a literal % character
>a    Client source IP address
>A    Client FQDN
>p    Client source port
la    Local IP address (http_port)
lp    Local port number (http_port)
sn    Unique sequence number per log line entry
ts    Seconds since epoch
tu    subsecond time (milliseconds)
tl    Local time. Optional strftime format argument
default %d/%b/%Y:%H:%M:%S %z
tg    GMT time. Optional strftime format argument
default %d/%b/%Y:%H:%M:%S %z
tr    Response time (milliseconds)
dt    Total time spent making DNS lookups (milliseconds)

 

Check the count and monitor established / time_wait TCP, UDP connections on Linux and Windows with the netstat command

Wednesday, February 6th, 2019

netstat-windows-linux-commands-to-better-understand-your-server-type-of-networrk-tcp-udp-connections

For me as a GNU / Linux sysadmin it is intuitive to check on a server the number of established connections, connections in time_wait state and so on.

I will not explain why this is necessary, as every system administrator out there who has had performance or network issues due to server / application connection overload, or has been a target of Denial of Service (DoS)
or Distributed Denial of Service (DDoS) attacks,
is well aware that a large number of connections in states such as SYN_ACK / TIME_WAIT or ESTABLISHED could be a very nasty thing and could bring a productive application or infrastructure service down for some time, costing from thousands of Euros up to millions for some businesses, as well as some amount of data loss …

To prevent this, sysadmins should therefore periodically take a look at the connection states on the servers they administer (and in this number I include not only sysadmins but also the DevOps guys who are deploying micro-services for a customer in the Cloud – yes, I believe Richard Stallman is right here, they're clouding your minds :).

Even though cloud services can provide a very high amount of hardware (CPU / Memory / Storage) resources, migrating a custom application to the Cloud often does not solve its design faults or even problems on a purely classical system administration level.

 

1. Get a statistic for FIN_WAIT1, FOREIGN, SYNC_RECV, LAST_ACK, TIME_WAIT, LISTEN and ESTABLISHED  Connections on GNU / Linux

 

On GNU / Linux and other UNIX-like systems the way to do it is to grep out the TCP / UDP connection type you need via netstat; a very useful cmd in that case is:

 

root@pcfreak:~# netstat -nat | awk '{print $6}' | sort | uniq -c | sort -n
      1 established)
      1 FIN_WAIT1
      1 Foreign
      1 SYN_RECV
      3 LAST_ACK
      4 FIN_WAIT2
      8 TIME_WAIT
     45 LISTEN
    147 ESTABLISHED

 

2. Netstat one-liner to get only ESTABLISHED and TIME_WAIT connection states

 

Other ways to check only TCP ESTABLISHED connections on Linux I use frequently are:

 

root@pcfreak:~# netstat -etna|grep -i establi|wc -l
145

 

netstat-connection-types-statistics-linux-established-time-wait-check-count

Or to get the whole list of connections, including those in transitional or closing states such as FIN_WAIT2, TIME_WAIT, SYN_RECV:

 

root@pcfreak:~# netstat -tupen |wc -l
164

 

3. Other Linux useful one liner commands to track your connection types
 

netstat -n -p | grep SYN_REC | sort -u

List all the IP addresses involved instead of just a count.

netstat -n -p | grep SYN_REC | awk '{print $5}' | awk -F: '{print $1}'

 

List all the unique IP addresses of the nodes that are sending SYN_REC connection status.

netstat -ntu | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -n

 

Use netstat command to calculate and count the number of connections each IP address makes to the server.

netstat -anp |grep 'tcp\|udp' | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -n

 

Count the number of connections each IP has to the server, using the TCP or UDP protocol.

netstat -ntu | grep ESTAB | awk '{print $5}' | cut -d: -f1 | sort | uniq -c | sort -nr

 

Check on ESTABLISHED connections instead of all connections, and display the connection count for each IP.

 

netstat -plan|grep :80|awk {'print $5'}|cut -d: -f 1|sort|uniq -c|sort -nk 1

 

Show and list the IP addresses and their connection counts to port 80 on the server. Port 80 is used mainly by HTTP web page requests.

Examples are taken from this nice blog post

 

4. Check the count of established connections on M$ Windows

 

As I'm forced to optimize a couple of Microsoft Windows DNS servers which are really slow to resolve, the logical question for me was how the ESTABLISHED and TIME_WAIT state connections could then be checked on Windows OS; after a quick investigation online I've come up with this:

 

C:\Users\admin> netstat -nao | find /i "estab" /c
78

 

netstat-check-number-of-established-ports-connections-windows
 

 

C:\Users\admin> netstat -nao | find /i "time_wait" /c
333

 

 

If you're used to the Linux watch command, then to do the same on Windows OS (e.g. re-check the output of netstat every second)
and print the output, use:

 

netstat -an 1 | find "3334"

 

The above command will show stats for services listening on TCP port 3334.

To find out which process on system sends packets to remote destination:

 

netstat -ano 1 | find "Dest_IP_Addr"

 

The -o parameter outputs the process ID (PID) responsible for the connection;
if you need to dig further, you can then find the respective process name with the tasklist cmd.
Another handy Windows netstat option is -b, which will show the EXE file behind a connection as well as
the DLL libraries it loaded that use TCP / UDP.

Another useful Windows netstat example is to grep for a port and show all established connections for it with:

 

netstat -an 1 | find "8080" | find "ESTABLISHED"

 

5. Closure


Hopefully this article gives you some idea of what is eating your bandwidth connections or overloading your GNU / Linux – Windows systems, and points you to the next logical thing to do: optimization / tuning settings for your system, for example with sysctl if on Linux – see my previous related article here.

I'll be interested to hear from sysadmin colleagues about other useful ways to track connections, perhaps with something like the ss tool (a utility to investigate sockets).
Also, any optimization hints that would give servers less downtime and improve network performance / throughput are most welcome.
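As a starting point with ss, the two commands below are, to the best of my knowledge, rough equivalents of the netstat one-liners above (ss is part of the iproute2 package, so it is present on most modern distros):

ss -s
ss -tan state established | wc -l

The first prints an overall socket summary; the second counts established TCP connections.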

 

How to disown a process once it is running on Linux – old but useful trick

Thursday, December 20th, 2018

how-to-disown-a-shell-running-process-on-linux-trick

There is one very old but gold useful UNIX / Linux trick I remembered which will be interesting to share: it is called disowning.


Let's say you run a job, e.g. an rsync job or a simple copy of a very large file, but in the middle of the copy you remember you need to do something else and thus want to get back to the shell (without opening a new ssh session if on a remote server, or a new console if on a local machine).
How can you then background the copy process and move it to the rest of the long-running system processes, e.g. "disown" it from yourself, so the process continues its job in the background just like the rest of the backgrounded running processes on the system?

Here is the basic syntax of the disown command:
 

help disown
disown: disown [-h] [-ar] [jobspec …]
    By default, removes each JOBSPEC argument from the table of active jobs.
    If the -h option is given, the job is not removed from the table, but is
    marked so that SIGHUP is not sent to the job if the shell receives a
    SIGHUP.  The -a option, when JOBSPEC is not supplied, means to remove all
    jobs from the job table; the -r option means to remove only running jobs.

 

Here is a live example of what I meant by above lines and actual situation where disown comes super useful.

The 'disown' command / builtin (this is in bash) disassociates a process from the shell, so the HUP signal is not sent to the process on shell exit.

root@linux:~# cp -rpf SomeReallyLargeFile1 SomeReallylargeFile2
^Z                      (press Ctrl + Z here to suspend the running copy)
[1]+  Stopped                 cp -i -r SomeReallyLargeFile SomeReallylargeFile2
root@linux:~#  bg %1
[1]+ cp -i -r SomeReallyLargeFile SomeReallylargeFile2 &
root@linux:~#  jobs
[1]+  Running                 cp -i -r testLargeFile largeFile2 &
root@linux:~# disown -h %1
root@linux:~# ps -ef |grep largeFile2
root      5790  5577  1 10:04 pts/3    00:00:00 cp -i -rpf SomeReallyLargeFile SomeReallylargeFile2
root      5824  5577  0 10:05 pts/3    00:00:00 grep largeFile2
root@linux:~#


Of course you can always use something like GNU screen (a VT100 / ANSI terminal screen manager) or tmux (a terminal multiplexer) to detach the process, but you would have had to start the screen / tmux session in advance, which you might not have done yet; also, one of the two has to be present on the server, and on many servers in complex client environments it might be missing and hard to install (e.g. the server sits in a firewalled, DMZ-ed (Demilitarized Zone) network with no way to install extra packages) – in such cases the disown command makes a lot of sense.

Another useful old tip that new Linux users might not know is the nohup command (which runs a command immune to hangups, with output to a non-tty). nohup's main use is when you want to run a process in the background with & (ampersand) from bash / zsh / tcsh etc. and keep the backgrounded process running even once you've exited the active shell; to do so, run the process in the background as follows:
 

$ nohup command-to-exec &
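A slightly fuller sketch that also preserves the command's output for later review (command and log path are only examples):

$ nohup rsync -av /data/src/ /data/dst/ > /tmp/rsync-job.log 2>&1 &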

 

Hope this helps someone, Enjoy!

 

Automatic network restart and reboot Linux server script if pings to the gateway time out, as a way to reduce connectivity downtime

Monday, December 10th, 2018

automatic-server-network-restart-and-reboot-script-if-connection-to-server-gateway-inavailable-tux-penguing-ascii-art-bin-bash

Inability of a server to come back online automatically after an electricity / network outage

These days my home server is experiencing a lot of issues due to electricity power outages; construction dig operations to fix / change water-pipe tubes near my home are in action, and perhaps the power cables got ruptured by the digger machine.
The effect of all this was that my server's network accessibility was affected, and as I didn't have network I couldn't access it remotely anymore. At a certain point the electricity was restored (and the UPS charge could keep the server up), however the server's accessibility was not restored until I asked a relative to restart it, or in more complicated cases a tech-acquainted guy had to help – Alexander (Alex), a close friend from school years (check his old site here – alex.pc-freak.net), helps a lot – to physically restart the machine, run quick restoration commands on the root TTY terminal, or generally check whether the default router is reachable.

This kind of Pc-Freak.net downtime issue became too frequent over the last month (the machine was down about 5 times for 2 to 5 hours, which was too much); weirdly enough, the server was not accessible from the Internet even after the electricity network was restored, and the only solution was a physical server restart (from the Power Button).

To decrease the number of cases in which relatives or friends have to physically go to the server and restart it each time after a network or electricity outage, I wrote a small script that checks accessibility towards the default network gateway of my server with a few ICMP packets sent with the good old PING command,
and triggers a network restart followed by a system reboot
(in case the network restart fails).

1. Create the reboot-if-nwork-is-down.sh script under /usr/sbin or another dir

Here is the script itself:

 

#!/bin/sh
# Script checks connectivity to the DEF GW with 10 rounds of 5 ICMP pings and
# if they fail triggers a networking restart (/etc/init.d/networking restart).
# Then it does another 10 x 5 PINGS and if the ping command still returns errors,
# reboots the machine.
# This script is useful if you run a home router with Linux and you have
# electricity outages and the machine doesn't come up unless rebooted.

GATEWAY_HOST='192.168.0.1';

run_ping () {
for i in $(seq 1 10); do
    ping -c 5 $GATEWAY_HOST
done

}

reboot_f () {
if [ $? -eq 0 ]; then
        echo "$(date "+%Y-%m-%d %H:%M:%S") Ping to $GATEWAY_HOST OK" >> /var/log/reboot.log
    else
    /etc/init.d/networking restart
        echo "$(date "+%Y-%m-%d %H:%M:%S") Restarted Network Interfaces" >> /var/log/reboot.log
    # make sure the reboot counter file exists before reading it
    [ -f /tmp/rebooted.txt ] || echo 0 > /tmp/rebooted.txt
    for i in $(seq 1 10); do ping -c 5 $GATEWAY_HOST; done
    if [ $? -ne 0 ] && [ "$(cat /tmp/rebooted.txt)" -lt 5 ]; then
        echo "$(date "+%Y-%m-%d %H:%M:%S") Ping to $GATEWAY_HOST FAILED !!! REBOOTING." >> /var/log/reboot.log
        # increment the counter (up to 5) before triggering the reboot
        n=$(cat /tmp/rebooted.txt)
        echo $(( n + 1 )) > /tmp/rebooted.txt
        /sbin/reboot
    fi
    # if rebooted 5 times already, sleep 30 mins and reset the counter
    if [ "$(cat /tmp/rebooted.txt)" -ge 5 ]; then
    sleep 1800
        cat /dev/null > /tmp/rebooted.txt
    fi
fi

}
run_ping;
reboot_f;

You can download a copy of reboot-if-nwork-is-down.sh script here.

As you see in the script, successful runs as well as failures are logged on the server in /var/log/reboot.log with a respective timestamp.
Also, a counter up to 5 is kept in /tmp/rebooted.txt, incremented on each script run that ends in a reboot; once the counter reaches 5,
a sleep of 30 minutes is executed and the counter is reset.
The counter check against 5 guarantees the server will not keep restarting like crazy if access to the gateway stays down for a long time.
 

2. Create a cron job to run reboot-if-nwork-is-down.sh every 15 minutes or so 

I've set the script to re-run as a scheduled (root user) cron job every 15 minutes.

To add the script to the existing cron rules without rewriting my old cron jobs and without having to interactively use crontab -u root -e (e.g. to add the cron job in a non-interactive mode with a single bash one-liner), I had to run the following command:

 

{ crontab -l; echo "*/15 * * * * /usr/sbin/reboot-if-nwork-is-down.sh >/dev/null 2>&1"; } | crontab -
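Before leaving it to cron, you may want to run the script once by hand in shell trace mode and check what got logged (bearing in mind it will really reboot the box if the gateway stays unreachable):

sh -x /usr/sbin/reboot-if-nwork-is-down.sh
tail /var/log/reboot.log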


I know restarting a server to restore accessibility is a stupid practice, but for home use or small client servers on unguaranteed networks with a cheap Uninterruptible Power Supply (UPS) device it is useful.

Summary

Time will show how efficient such a "self-healing" script practice is.
I'm pretty sure that even in corporate businesses and large Public / Private / Hybrid Clouds, where access to remotely mounted NFS / XFS / ZFS filesystems fails at times, a modification of the script could save you a lot of nerves and troubles and unhappy customers / managers screaming at you on the phone 🙂


I'll be interested to hear from others who have better ideas on how to restore ( resurrect ) access to an inaccessible Linux server after an outage.
 

Create SFTP CHROOT Jail User for data transfer to improve Linux shared web hosting server security

Monday, December 3rd, 2018

Adding user SFTP access to a Linux system is often required, and for multi-user or web hosting environments it is an absolute requirement to have SFTP user space separation ( isolation ) from the basic Linux system environment; this is done using a fake CHROOT Jail.

The purpose of this article is to show how to create an SFTP chroot jail in a few easy configuration steps.

By isolating each user into his own space you prevent users from eventually stealing or mistakenly leaking information such as user credentials / passwords etc.

Besides that, it is useful to restrict the user to his own file / web space, granting access to Secure FTP (SFTP) only and not SSH login access, and together with the chroot jail environment to protect your server from attempts to be hacked (rooted / exploited) through some (0day) zero-day kernel 1337 vulnerability.

1. Setup Chrooted file system and do the bind mount in /etc/fstab
 

# chown root:root /mnt/data/share
# chmod 755 /mnt/data/share
# mkdir -p /sftp/home
# mount -o bind /mnt/data/share /sftp/home

Next, edit /etc/fstab (e.g. vim /etc/fstab) and add the following line:
 

/mnt/data/share /sftp/home  none   bind   0   0


To mount it next:
 

# mount -a


/mnt/data/share is a mounted HDD in my case but could be any external attached storage

 

2. Create User and sftpgroup group and add your new SFTP Jailed user accounts to it

To achieve an SFTP-only chroot jail environment you need a new UNIX group created, such as sftpgroup, and to use it to assign proper ownership / permissions to the newly added SFTP-restricted accounts.
 

# groupadd sftpgroup


Once the group exists, the next step is to create the desired username / usernames with the useradd command and assign them to sftpgroup:

 

# adduser sftp-account1 -s /sbin/nologin -d /sftp/home
# passwd sftp-account1

 

usermod -aG sftpgroup sftp-account1


Both commands above could also be done in one line with adduser:

 

# adduser sftp-account1 -g sftpgroup -s /sbin/nologin -d /sftp/home

Note the /sbin/nologin shell, which is set to prevent SSH logins but still allows access via sftp / scp data transfer clients. Once the user exists, it is a good idea to prepare the jailed environment under a separate directory under the root filesystem, let's say in /sftp/home/.

3. Set proper permissions to User chrooted /home folder

# mkdir -p /sftp/home
# mkdir /sftp/home/sftp-account1
# chown root:root /sftp/
# chown sftp-account1:sftpgroup /sftp/home/sftp-account1

For each newly created user (in this case sftp-account1) make sure the permissions are properly set to make the files readable only by the respective user.

# chmod 700 -R /sftp/home/sftp-account1

For every next created user don't forget to do the same.

4. Modify the SSHD configuration file to add chroot Match rules

Edit the /etc/ssh/sshd_config file and add the below configuration to the end of it:

# vim /etc/ssh/sshd_config
Subsystem sftp internal-sftp     
Match Group sftpgroup   
ChrootDirectory /sftp/home   
ForceCommand internal-sftp   
X11Forwarding no   
AllowTcpForwarding no
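Before restarting, it doesn't hurt to let sshd validate the modified configuration first; in test mode it prints nothing when the syntax is fine:

# /usr/sbin/sshd -t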


Restart sshd to make the new settings take effect; to make sure you don't end up with no access (if it is a remote server), first run an extra sshd daemon on a secondary port like so:
 

# /usr/sbin/sshd -p 2208 &

Then restart sshd – if it is old Linux with Init V support

# /etc/init.d/sshd restart

– For systemd Linux systems

# systemctl restart sshd


5. Verify the username (sftp-account1) can login only via SFTP and his environment is chrooted

 

ssh sftp-account1@pc-freak.net

This service allows sftp connections only.
Connection to 83.228.93.76 closed.

 

sftp sftp-account1@pc-freak.net
Connected to 83.228.93.76.
sftp>


6. Closure

The quick summary of what we have achieved above is: we restricted a Linux user from having /bin/shell access while still allowing Secure FTP copy, in a few steps:

a. created a new user and group for SFTP chrooted restricted access only
b. set proper permissions to make the folder accessible only by the user itself
c. added the necessary sshd config and restarted sshd to make it work
d. tested the configuration

This short guide was based on the SFTP chroot documentation in the Arch Linux wiki, you can check it here.

Prevent an rsync cron job from running multiple times in parallel on Linux

Wednesday, November 21st, 2018

prevent-rsync-rsync-to-run-multiple-times-via-cronjob-on-linux

Today I had a report of a server whose Load Average stays at the high level of 86; the machine runs on bare metal, rock solid hardware, and even with such high kernel loads it runs fine, but due to the I/O overhead the SAN reads from a remote NetApp storage device started to be sluggish, hence the machine needed to be reviewed, so I jumped in via the hop station (jump host) into the server.
 

1. Short investigation of the root cause for the high server load


After a short investigation, I found an rsync job set by someone as a cron job to routinely run every 30 minutes; the old scheduled rsync was running multiple times on the server (about 50 processes of the same rsync filesystem synchronization), and as expected the storage was saddled with multiple Input / Output requests.

The root cron job was like that:
 

server:~# crontab -u root -l |grep -i rsync
*/30 * * * * /usr/bin/rsync -ax /var/www/htdocs/directory_to_synchronize/ /srv/www/synch_back/directory_to_synchrnize


A process list showed the following high number of running mirrored rsyncs:

 

server:~# ps axuwwf | grep -i rsync | wc -l
80


 

2. The Fix – run rsync via cron only in case it is not already running in the background


In order to fix it, I had to kill all currently running rsync processes (here luckily only instances of the same single rsync job were running, but generally I was cautious to check that no other rsync jobs were running – otherwise I would have mistakenly killed some other ongoing rsync job …)

Then I set the following new cron job, a one-liner quick shell script that does the job: it assigns a lock file that is created before rsync runs and deleted after its completion.
 

if [ ! -e /tmp/repo_dba_sync.lock ]; then trap 'rm -f /tmp/repo_dba_sync.lock' EXIT; touch /tmp/repo_dba_sync.lock; /usr/bin/rsync -ax /var/www/htdocs/directory_to_synchronize/ /srv/www/synch_back/directory_to_synchrnize; fi >/dev/null 2>&1


The cron job looked like so:

 

*/30 * * * * if [ ! -e /tmp/repo_dba_sync.lock ]; then trap 'rm -f /tmp/repo_dba_sync.lock' EXIT; touch /tmp/repo_dba_sync.lock; /usr/bin/rsync -ax /var/www/htdocs/directory_to_synchronize/ /srv/www/synch_back/directory_to_synchrnize; fi >/dev/null 2>&1

Just in case you're wondering:
the trap is there to guarantee that the lock file is removed when the (sub)shell exits for any reason.
This way the lock file will be removed even if rsync aborts before the end of the run.

An alternative and simpler way to do it is via:
 

pgrep rsync > /dev/null || rsync -ax /var/www/htdocs/directory_to_synchronize/ /srv/www/synch_back/directory_to_synchrnize

 

Or, if you don't want to use shell's

if [ ]; then ... fi

condition but still want a file lock, the flock command can be used like so:
 

flock -n lock_file -c "rsync …"
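Put together as a complete crontab entry, the flock variant of the same job could look like the following sketch (same example paths as above):

*/30 * * * * flock -n /tmp/repo_dba_sync.lock -c "/usr/bin/rsync -ax /var/www/htdocs/directory_to_synchronize/ /srv/www/synch_back/directory_to_synchrnize" >/dev/null 2>&1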

Create and Configure SSL bundle file for GoGetSSL issued certificate in Apache Webserver on Linux

Saturday, November 3rd, 2018

gogetssl-install-certificate-on-linux-howto-sslcertificatechainfile-obsolete

I had a small task to configure a new WildCard SSL for domains on a Debian GNU / Linux Jessie running Apache 2.4.25.

The official documentation on how to install the SSL certificate on Linux, given by GoGetSSL (whose certificates are issued by COMODO), was obsolete as of the time of writing this article and suggested the following install instructions:
 

SSLEngine on
SSLCertificateKeyFile /etc/ssl/ssl.key/server.key
SSLCertificateFile /etc/ssl/ssl.crt/yourDomainName.crt
SSLCertificateChainFile /etc/ssl/ssl.crt/yourDomainName.ca-bundle


Adding such configuration to the domain vhost and testing with apache2ctl spits out an error like:

 

root@webserver:~# apache2ctl configtest
AH02559: The SSLCertificateChainFile directive (/etc/apache2/sites-enabled/the-domain-name-ssl.conf:17) is deprecated, SSLCertificateFile should be used instead
Syntax OK

 


Hence, to make the issued GoGetSSL certificate work on Debian Linux, here are the few things done:

The files issued by Gogetssl.COM were the following:

 

AddTrust_External_CA_Root.crt
COMODO_RSA_Certification_Authority.crt
the-domain-name.crt


The webserver already had SSL support via the mod_ssl Apache module, e.g.:

 

root@webserver:~# ls -al /etc/apache2/mods-available/*ssl*
-rw-r--r-- 1 root root 3112 окт 21  2017 /etc/apache2/mods-available/ssl.conf
-rw-r--r-- 1 root root   97 сеп 19  2017 /etc/apache2/mods-available/ssl.load
root@webserver:~# ls -al /etc/apache2/mods-enabled/*ssl*
lrwxrwxrwx 1 root root 26 окт 19  2017 /etc/apache2/mods-enabled/ssl.conf -> ../mods-available/ssl.conf
lrwxrwxrwx 1 root root 26 окт 19  2017 /etc/apache2/mods-enabled/ssl.load -> ../mods-available/ssl.load


For those who don't have mod_ssl enabled, to enable it quickly run:

 

# a2enmod ssl


The VirtualHost used for the domains had Apache config as below:

 

 

 

NameVirtualHost *:443

<VirtualHost *:443>
    ServerAdmin support@the-domain-name.com
    ServerName the-domain-name.com
    ServerAlias *.the-domain-name.com the-domain-name.com

    DocumentRoot /home/the-domain-namecom/www
    SSLEngine On
#    <Directory />
#        Options FollowSymLinks
#        AllowOverride None
#    </Directory>
    <Directory /home/the-domain-namecom/www>
        Options Indexes FollowSymLinks MultiViews
        AllowOverride None
        Include /home/the-domain-namecom/www/htaccess_new.txt
        Order allow,deny
        allow from all
    </Directory>

    ScriptAlias /cgi-bin/ /usr/lib/cgi-bin/
    <Directory "/usr/lib/cgi-bin">
        AllowOverride None
        Options +ExecCGI -MultiViews +SymLinksIfOwnerMatch
        Order allow,deny
        Allow from all
    </Directory>

    ErrorLog ${APACHE_LOG_DIR}/error.log

    # Possible values include: debug, info, notice, warn, error, crit,
    # alert, emerg.
    LogLevel warn

    CustomLog ${APACHE_LOG_DIR}/access.log combined

#    Alias /doc/ "/usr/share/doc/"
#   <Directory "/usr/share/doc/">
#       Options Indexes MultiViews FollowSymLinks
#       AllowOverride None
#       Order deny,allow
#       Deny from all
#       Allow from 127.0.0.0/255.0.0.0 ::1/128
#   </Directory>
SSLCertificateKeyFile /etc/apache2/ssl/the-domain-name.com.key
SSLCertificateFile /etc/apache2/ssl/chain.crt

 

</VirtualHost>

The config directives enabling and making the SSL actually work are:
 

SSLEngine On
SSLCertificateKeyFile /etc/apache2/ssl/the-domain-name.com.key
SSLCertificateFile /etc/apache2/ssl/chain.crt

 

The chain.crt file is actually a bundle containing the domain certificate and the two gogetssl CA files (COMODO_RSA_Certification_Authority.crt and AddTrust_External_CA_Root.crt); to prepare that file, I've used the small bundle.sh script found on serverfault.com here (I've made a mirror of bundle.sh on pc-freak.net here).

To prepare the chain.crt  bundle, I ran:

 

sh create-ssl-bundle.sh _iq-test_cc.crt > chain.crt
sh create-ssl-bundle.sh COMODO_RSA_Certification_Authority.crt >> chain.crt
sh create-ssl-bundle.sh AddTrust_External_CA_Root.crt >> chain.crt
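As a certificate chain bundle is simply the PEM certificates concatenated in the right order (domain certificate first, then the intermediate, then the root), a plain cat should produce an equivalent chain.crt, in case you prefer not to depend on the helper script:

cat the-domain-name.crt COMODO_RSA_Certification_Authority.crt AddTrust_External_CA_Root.crt > chain.crt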


Then I copied the file to /etc/apache2/ssl together with the the-domain-name.com.key file generated earlier with the openssl command, as explained in my article on how to install a RapidSSL certificate on Linux.

/etc/apache2/ssl was not previously existing (on Debian Linux), so to create it:

 

root@webserver:~# mkdir /etc/apache2/ssl
root@webserver:~# ls -al /etc/apache2/ssl/chain.crt
-rw-r--r-- 1 root root 20641 Nov  2 12:27 /etc/apache2/ssl/chain.crt
root@webserver:~# ls -al /etc/apache2/ssl/the-domain-name.com.key
-rw-r--r-- 1 root root 6352 Nov  2 20:35 /etc/apache2/ssl/the-domain-name.com.key

 

As I needed to add the SSL HTTPS configuration for multiple domains, further on I wrote and used a tiny shell script, add_new_vhost.sh, which accepts as an argument the domain name I want to add. The script works with a sample skele (template) file, which is included in the script itself and can be easily modified for the desired vhost config.
To add my multiple domains, I've used the script as follows:
 

sh add_new_vhost.sh add-new-site-domain.com
sh add_new_vhost.sh add-new-site-domain1.com


etc.

Here is the complete script as well:

 

#!/bin/sh
# Shell script to add easily new domains for virtual hosting on Debian machines
# arg1 should be a domain name
# This script takes the domain name which you type as arg1 uses it and creates
# Docroot / cgi-bin directory for the domain, creates a separate site apache log directory,
# then takes a skele.com file and substitutes a skele.com with your domain name and directories
# This script's aim is to easily enable sysadmin to add new domains in Debian
sites_base_dir=/var/www/jail/home/www-data/sites/;
# the directory where the skele.com file is
skele_dir=/etc/apache2/sites-available;
# base directory where site log dir to be created
cr_sep_log_file_d=/var/log/apache2/sites;
# owner of the directories
username='www-data';
# read arg0 and arg1
arg0=$0;
arg1=$1;
if [ -z "$arg1" ]; then
echo "Missing domain name";
exit 1;
fi

 

# skele template
echo "#
#  Example.com (/etc/apache2/sites-available/www.skele.com)
#
<VirtualHost *>
        ServerAdmin admin@design.bg
        ServerName  skele.com
        ServerAlias www.skele.com


        # Indexes + Directory Root.
        DirectoryIndex index.php index.htm index.html index.pl index.cgi index.phtml index.jsp index.py index.asp

        DocumentRoot /var/www/jail/home/www-data/sites/skelecom/www/docs
        ScriptAlias /cgi-bin "/var/www/jail/home/www-data/sites/skelecom/cgi-bin"
        
        # Logfiles
        ErrorLog  /var/log/apache2/sites/skelecom/error.log
        CustomLog /var/log/apache2/sites/skelecom/access.log combined
#       CustomLog /dev/null combined
      <Directory /var/www/jail/home/www-data/sites/skelecom/www/docs/>
                Options FollowSymLinks MultiViews -Includes
                AllowOverride None
                Order allow,deny
                allow from all
                # This directive allows us to have apache2's default start page
                # in /apache2-default/, but still have / go to the right place
#               RedirectMatch ^/$ /apache2-default/
        </Directory>

        <Directory /var/www/jail/home/www-data/sites/skelecom/www/docs/>
                Options FollowSymLinks ExecCGI -Includes
                AllowOverride None
                Order allow,deny
                allow from all
        </Directory>

</VirtualHost>
" > $skele_dir/skele.com;

domain_dir=$(echo $arg1 | sed -e 's/\.//g');
new_site_dir=$sites_base_dir/$domain_dir/www/docs;
echo "Creating $new_site_dir";
mkdir -p $new_site_dir;
mkdir -p $sites_base_dir/cgi-bin;
echo "Creating sites's Docroot and CGI directory";
chown -R $username:$username $new_site_dir;
chown -R $username:$username $sites_base_dir/cgi-bin;
echo "Creating site's Log files Directory";
mkdir -p $cr_sep_log_file_d/$domain_dir;
echo "Creating sites's VirtualHost file and adding it for startup";
sed -e "s#skele.com#$arg1#g" -e "s#skelecom#$domain_dir#g" $skele_dir/skele.com >> $skele_dir/$arg1;
ln -sf $skele_dir/$arg1 /etc/apache2/sites-enabled/;
echo "All Completed please restart apache /etc/init.d/apache restart to Load the new virtual domain";

# Date Fri Jan 11 16:27:38 EET 2008


Using the script saves a lot of time compared to manually copying a vhost file and then editing it to change the ServerName directive. For vhosts whose configuration is identical and where only the ServerName listener changes, it is perfect for creating all the necessary domains; I've created a simple text file with each of the domains and ran the script in a loop:
 

while read -r i; do sh add_new_vhost.sh "$i"; done < domain_list.txt
 

 

How to install custom Font files on Linux with font-viewer, fc-cache, font-manager – Install Church Slavonic fonts on GNU / Linux

Saturday, October 27th, 2018

install-custom-fonts-on-linux-easily-linux-libertine-alphabet-typography-font-u-shaped

If you're regularly using GIMP for image editing or LibreOffice for office stuff, or any other program where you might want to add / edit fonts, then you will certainly come to a point of wondering how to manually add new .TTF (TrueType Font) or .AFM / .PFB (Type 1) font files.
Multiple fonts can be found with the apt-get install tool in the Debian / Ubuntu repos, but adding third-party fonts provided by some random graphics designer is sometimes a necessity.

For example, earlier I've blogged on What is Church Slavonic and collected a large pack of Church Slavonic fonts, which at that time I installed on a Windows 7 PC; the question is, once downloaded, how can these fonts be added / installed so that Xorg and the font-rendering programs on GNU / Linux are aware of the new fonts and they can be used in various programs?

gnome-font-viewer-program-gnu-linux-screenshot

The easiest way to install a font on Linux is to double-click the new font you want to install; in the GNOME GUI environment this opens the Font Viewer program (gnome-font-viewer). However, it is a tedious task to install fonts in that manner if you have to install some 100 or 200 new fonts by clicking on each one.

To make the new downloaded pack of fonts available on a user level, it is as simple as placing them in a fonts folder in $HOME, e.g. in ~/.fonts (on some distributions ~/.local/share/fonts is used instead); this makes them available for use on the next X session login.
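For example, for a downloaded pack of fonts (the download directory below is just an example):

$ mkdir -p ~/.fonts
$ cp ~/Downloads/church-slavonic-fonts/*.ttf ~/.fonts/
$ fc-cache -f -v ~/.fonts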

To make new fonts available system-wide (e.g. for all existing or newly logged-in Xorg users), it is as simple as copying all the new font files (TTF, PFM, PFB etc.) you'd like to add to /usr/local/share/fonts:
 

# cp -rpf ~/Desktop/fonts-folder/* /usr/local/share/fonts/


And run fc-cache to rescan and rebuild the font cache files based on the fonts copied:

 

 fc-cache -f -v


To check whether the new fonts are present you can list all available fonts with:

 

fc-list

 

/usr/share/fonts/truetype/lato/Lato-Medium.ttf: Lato,Lato Medium:style=Medium,Regular
/usr/share/fonts/truetype/msttcorefonts/comicbd.ttf: Comic Sans MS:style=Bold,Negreta,tučné,fed,Fett,Έντονα,Negrita,Lihavoitu,Gras,Félkövér,Grassetto,
Vet,Halvfet,Pogrubiony,Negrito,Полужирный,Fet,Kalın,Krepko,Lodia
/usr/share/fonts/truetype/lato/Lato-SemiboldItalic.ttf: Lato,
Lato Semibold:style=Semibold Italic,Italic
/usr/local/share/fonts/TriKUcs.pfb: Triodion kUcs:style=Regular
/usr/share/fonts/truetype/dejavu/DejaVuSerif-Bold.ttf: DejaVu Serif:style=Bold
/usr/local/share/fonts/OglUcs8.ttf: Oglavie Ucs:style=Regular
/usr/share/fonts/truetype/noto/NotoSansThai-Regular.ttf: Noto Sans Thai:style=Regular
/usr/local/share/fonts/freefont-20080323/FreeSerifBold.ttf: FreeSerif:style=Bold,polkrepko
/usr/local/share/fonts/TITUSEN.TTF: Titus SyriacEstrangelo:style=Regular
/usr/local/share/fonts/feofanucs.ttf: Feofan Ucs:style=Regular
/usr/local/share/fonts/OstgDSoIEUcs8.ttf: Ostrog\-Dol ieUcs:style=SpacedOut
/usr/share/fonts/truetype/dejavu/DejaVuSansMono.ttf: DejaVu Sans Mono:style=Book
/usr/share/fonts/truetype/noto/NotoSansCypriot-Regular.ttf: Noto Sans Cypriot:style=Regular
/usr/local/share/fonts/ZlatUcs.pfb: Zlatoust Ucs:style=Regular
..
.

 


To look for a certain font that is supposed to be installed, run:

 

fc-list|grep -i "Times New Roman"
/usr/share/fonts/truetype/msttcorefonts/Times_New_Roman.ttf: Times New Roman:style=Regular,Normal,obyčejné,Standard,Κανονικά,
Normaali,Normál,Normale,Standaard,Normalny,Обычный,Normálne,Navadno,thường,Arrunta

 

fc-list|grep -i "slavonic"
/usr/local/share/fonts/TITUSN__.TTF: Titus Slavonic:style=Normal

 



Another good tool for GNOME users is font-manager; if you don't have it already installed:

 

apt-get install font-manager


One of the cool things about it is that it can show you the licensing of each of the system-installed fonts and the full list of font character sets, and it can visualize different pixel font sizes in the so-called "waterfall" font view.