Posts Tagged ‘init’

Fix Out of inodes on Postfix Linux Mail Cluster. How to clean up filesystem running out of Inodes, Filesystem inodes on partition is 100% full

Wednesday, August 25th, 2021

Inode_Entry_inode-table-content

Recently we have faced a strange issue with with one of our Clustered Postfix Mail servers (the cluster is with 2 nodes that each has configured Postfix daemon mail servers (running on an OpenVZ virtualized environment).
A heartbeat that checks liveability of clusters and switches nodes in case of one of the two gets broken due to some reason), pretty much a standard SMTP cluster.

So far so good but since the cluster is a kind of abondoned and is pretty much legacy nowadays and used just for some Monitoring emails from different scripts and systems on servers, it was not really checked thoroughfully for years and logically out of sudden the alarming email content sent via the cluster stopped working.

The normal sysadmin job here  was to analyze what is going on with the cluster and fix it ASAP. After some very basic analyzing we catched the problem is caused by a  "inodes full" (100% of available inodes were occupied) problem, e.g. file system run out of inodes on both machines perhaps due to a pengine heartbeat process  bug  leading to producing a high number of .bz2 pengine recovery archive files stored in /var/lib/pengine>

Below are the few steps taken to analyze and fix the problem.
 

1. Finding out about the the system run out of inodes problem


After logging on to system and not finding something immediately is wrong with inodes, all I can see from crm_mon is cluster was broken.
A plenty of emails were left inside the postfix mail queue visible with a standard command

[root@smtp1: ~ ]# postqueue -p

It took me a while to find ot the problem is with inodes because a simple df -h  was showing systems have enough space but still cluster quorum was not complete.
A bit of further investigation led me to a  simple df -i reporting the number of inodes on the local filesystems on both our SMTP1 and SMTP2 got all occupied.

[root@smtp1: ~ ]# df -i
Filesystem            Inodes   IUsed   IFree IUse% Mounted on
/dev/simfs            500000   500000  0   100% /
none                   65536      61   65475    1% /dev

As you can see the number of inodes on the Virual Machine are unfortunately depleted

Next step was to check directories occupying most inodes, as this is the place from where files could be temporary moved to a remote server filesystem or moved to another partition with space on a server locally attached drives.
Below command gives an ordered list with directories locally under the mail root filesystem / and its respective occupied number files / inodes,
the more files under a directory the more inodes are being occupied by the files on the filesystem.

 

run-out-if-inodes-what-is-inode-find-out-which-filesystem-or-directory-eating-up-all-your-system-inodes-linux_inode_diagram.gif
1.1 Getting which directory consumes most of the inodes on the systems

 

[root@smtp1: ~ ]# { find / -xdev -printf '%h\n' | sort | uniq -c | sort -k 1 -n; } 2>/dev/null
….
…..

…….
    586 /usr/lib64/python2.4
    664 /usr/lib64
    671 /usr/share/man/man8
    860 /usr/bin
   1006 /usr/share/man/man1
   1124 /usr/share/man/man3p
   1246 /var/lib/Pegasus/prev_repository_2009-03-10-1236698426.308128000.rpmsave/root#cimv2/classes
   1246 /var/lib/Pegasus/prev_repository_2009-05-18-1242636104.524113000.rpmsave/root#cimv2/classes
   1246 /var/lib/Pegasus/prev_repository_2009-11-06-1257494054.380244000.rpmsave/root#cimv2/classes
   1246 /var/lib/Pegasus/prev_repository_2010-08-04-1280907760.750543000.rpmsave/root#cimv2/classes
   1381 /var/lib/Pegasus/prev_repository_2010-11-15-1289811714.398469000.rpmsave/root#cimv2/classes
   1381 /var/lib/Pegasus/prev_repository_2012-03-19-1332151633.572875000.rpmsave/root#cimv2/classes
   1398 /var/lib/Pegasus/repository/root#cimv2/classes
   1696 /usr/share/man/man3
   400816 /var/lib/pengine

Note, the above command orders the files from bottom to top order and obviosuly the bottleneck directory that is over-eating Filesystem inodes with an exceeding amount of files is
/var/lib/pengine
 

2. Backup old multitude of files just in case of something goes wrong with the cluster after some files are wiped out


The next logical step of course is to check what is going on inside /var/lib/pengine just to find a very ,very large amount of pe-input-*NUMBER*.bz2 files were suddenly produced.

 

[root@smtp1: ~ ]# ls -1 pe-input*.bz2 | wc -l
 400816


The files are produced by the pengine process which is one of the processes that is controlling the heartbeat cluster state, presumably it is done by running process:

[root@smtp1: ~ ]# ps -ef|grep -i pengine
24        5649  5521  0 Aug10 ?        00:00:26 /usr/lib64/heartbeat/pengine


Hence in order to fix the issue, to prevent some inconsistencies in the cluster due to the file deletion,  copied the whole directory to another mounted parition (you can mount it remotely with sshfs for example) or use a local one if you have one:

[root@smtp1: ~ ]# cp -rpf /var/lib/pengine /mnt/attached_storage


and proceeded to clean up some old multitde of files that are older than 2 years of times (720 days):


3. Clean  up /var/lib/pengine files that are older than two years with short loop and find command

 


First I made a list with all the files to be removed in external text file and quickly reviewed it by lessing it like so

[root@smtp1: ~ ]#  cd /var/lib/pengine
[root@smtp1: ~ ]# find . -type f -mtime +720|grep -v pe-error.last | grep -v pe-input.last |grep -v pe-warn.last -fprint /home/myuser/pengine_older_than_720days.txt
[root@smtp1: ~ ]# less /home/myuser/pengine_older_than_720days.txt


Once reviewing commands I've used below command to delete the files you can run below command do delete all older than 2 years that are different from pe-error.last / pe-input.last / pre-warn.last which might be needed for proper cluster operation.

[root@smtp1: ~ ]#  for i in $(find . -type f -mtime +720 -exec echo '{}' \;|grep -v pe-error.last | grep -v pe-input.last |grep -v pe-warn.last); do echo $i; done


Another approach to the situation is to simply review all the files inside /var/lib/pengine and delete files based on year of creation, for example to delete all files in /var/lib/pengine from 2010, you can run something like:
 

[root@smtp1: ~ ]# for i in $(ls -al|grep -i ' 2010 ' | awk '{ print $9 }' |grep -v 'pe-warn.last'); do rm -f $i; done


4. Monitor real time inodes freeing

While doing the clerance of old unnecessery pengine heartbeat archives you can open another ssh console to the server and view how the inodes gets freed up with a command like:

 

# check if inodes is not being rapidly decreased

[root@csmtp1: ~ ]# watch 'df -i'


5. Restart basic Linux services producing pid files and logs etc. to make then workable (some services might not be notified the inodes on the Hard drive are freed up)

Because the hard drive on the system was full some services started to misbehaving and /var/log logging was impacted so I had to also restart them in our case this is the heartbeat itself
that  checks clusters nodes availability as well as the logging daemon service rsyslog

 

# restart rsyslog and heartbeat services
[root@csmtp1: ~ ]# /etc/init.d/heartbeat restart
[root@csmtp1: ~ ]# /etc/init.d/rsyslog restart

The systems had been a data integrity legacy service samhain so I had to restart this service as well to reforce the /var/log/samhain log file to again continusly start writting data to HDD.

# Restart samhain service init script 
[root@csmtp1: ~ ]# /etc/init.d/samhain restart


6. Check up enough inodes are freed up with df

[root@smtp1 log]# df -i
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/simfs 500000 410531 19469 91% /
none 65536 61 65475 1% /dev


I had to repeat the same process on the second Postfix cluster node smtp2, and after all the steps like below check the status of smtp2 node and the postfix queue, following same procedure made the second smtp2 cluster member as expected 🙂

 

7. Check the cluster node quorum is complete, e.g. postfix cluster is operating normally

 

# Test if email cluster is ok with pacemaker resource cluster manager – lt-crm_mon
 

[root@csmtp1: ~ ]# crm_mon -1
============
Last updated: Tue Aug 10 18:10:48 2021
Stack: Heartbeat
Current DC: smtp2.fqdn.com (bfb3d029-89a8-41f6-a9f0-52d377cacd83) – partition with quorum
Version: 1.0.12-unknown
2 Nodes configured, unknown expected votes
4 Resources configured.
============

Online: [ smtp2.fqdn.com smtp1.fqdn.com ]

failover-ip (ocf::heartbeat:IPaddr2): Started csmtp1.ikossvan.de
Clone Set: postfix_clone
Started: [ smtp2.fqdn.com smtp1fqdn.com ]
Clone Set: pingd_clone
Started: [ smtp2.fqdn.com smtp1.fqdn.com ]
Clone Set: mailto_clone
Started: [ smtp2.fqdn.com smtp1.fqdn.com ]

 

8.  Force resend a few hundred thousands of emails left in the email queue


After some inodes gets freed up due to the file deletion, i've reforced a couple of times the queued mail servers to be immediately resent to remote mail destinations with cmd:

 

# force emails in queue to be resend with postfix

[root@smtp1: ~ ]# sendmail -q


– It was useful to watch in real time how the queued emails are quickly decreased (queued mails are successfully sent to destination addresses) with:

 

# Monitor  the decereasing size of the email queue
[root@smtp1: ~ ]# watch 'postqueue -p|grep -i '@'|wc -l'

How to build Linux logging bash shell script write_log, logging with Named Pipe buffer, Simple Linux common log files logging with logger command

Monday, August 26th, 2019

how-to-build-bash-script-for-logging-buffer-named-pipes-basic-common-files-logging-with-logger-command

Logging into file in GNU / Linux and FreeBSD is as simple as simply redirecting the output, e.g.:
 

echo "$(date) Whatever" >> /home/hipo/log/output_file_log.txt


or with pyping to tee command

 

echo "$(date) Service has Crashed" | tee -a /home/hipo/log/output_file_log.txt


But what if you need to create a full featured logging bash robust shell script function that will run as a daemon continusly as a background process and will output
all content from itself to an external log file?
In below article, I've given example logging script in bash, as well as small example on how a specially crafted Named Pipe buffer can be used that will later store to a file of choice.
Finally I found it interesting to mention few words about logger command which can be used to log anything to many of the common / general Linux log files stored under /var/log/ – i.e. /var/log/syslog /var/log/user /var/log/daemon /var/log/mail etc.
 

1. Bash script function for logging write_log();


Perhaps the simplest method is just to use a small function routine in your shell script like this:
 

write_log()
LOG_FILE='/root/log.txt';
{
  while read text
  do
      LOGTIME=`date "+%Y-%m-%d %H:%M:%S"`
      # If log file is not defined, just echo the output
      if [ “$LOG_FILE” == “” ]; then
    echo $LOGTIME": $text";
      else
        LOG=$LOG_FILE.`date +%Y%m%d`
    touch $LOG
        if [ ! -f $LOG ]; then echo "ERROR!! Cannot create log file $LOG. Exiting."; exit 1; fi
    echo $LOGTIME": $text" | tee -a $LOG;
      fi
  done
}

 

  •  Using the script from within itself or from external to write out to defined log file

 

echo "Skipping to next copy" | write_log

 

2. Use Unix named pipes to pass data – Small intro on what is Unix Named Pipe.


Named Pipe –  a named pipe (also known as a FIFO (First In First Out) for its behavior) is an extension to the traditional pipe concept on Unix and Unix-like systems, and is one of the methods of inter-process communication (IPC). The concept is also found in OS/2 and Microsoft Windows, although the semantics differ substantially. A traditional pipe is "unnamed" and lasts only as long as the process. A named pipe, however, can last as long as the system is up, beyond the life of the process. It can be deleted if no longer used.
Usually a named pipe appears as a file, and generally processes attach to it for IPC.

 

Once named pipes were shortly explained for those who hear it for a first time, its time to say named pipe in unix / linux is created with mkfifo command, syntax is straight foward:
 

mkfifo /tmp/name-of-named-pipe


Some older Linux-es with older bash and older bash shell scripts were using mknod.
So idea behind logging script is to use a simple named pipe read input and use date command to log the exact time the command was executed, here is the script.

 

#!/bin/bash
named_pipe='/tmp/output-named-pipe';
output_named_log='
/tmp/output-named-log.txt ';

if [ -p $named_pipe ]; then
rm -f $named_pipe
fi
mkfifo $named_pipe

while true; do
read LINE <$named_pipe
echo $(date): "$LINE" >>/tmp/output-named-log.txt
done


To write out any other script output and get logged now, any of your output with a nice current date command generated output write out any output content to the loggin buffer like so:

 

echo 'Using Named pipes is so cool' > /tmp/output-named-pipe
echo 'Disk is full on a trigger' > /tmp/output-named-pipe

  • Getting the output with the date timestamp

# cat /tmp/output-named-log.txt
Mon Aug 26 15:21:29 EEST 2019: Using Named pipes is so cool
Mon Aug 26 15:21:54 EEST 2019: Disk is full on a trigger


If you wonder why it is better to use Named pipes for logging, they perform better (are generally quicker) than Unix sockets.

 

3. Logging files to system log files with logger

 

If you need to do a one time quick way to log any message of your choice with a standard Logging timestamp, take a look at logger (a part of bsdutils Linux package), and is a command which is used to enter messages into the system log, to use it simply invoke it with a message and it will log your specified output by default to /var/log/syslog common logfile

 

root@linux:/root# logger 'Here we go, logging'
root@linux:/root # tail -n 3 /var/log/syslog
Aug 26 15:41:01 localhost CRON[24490]: (root) CMD (chown qscand:qscand -R /var/run/clamav/ 2>&1 >/dev/null)
Aug 26 15:42:01 localhost CRON[24547]: (root) CMD (chown qscand:qscand -R /var/run/clamav/ 2>&1 >/dev/null)
Aug 26 15:42:20 localhost hipo: Here we go, logging

 

If you have took some time to read any of the init.d scripts on Debian / Fedora / RHEL / CentOS Linux etc. you will notice the logger logging facility is heavily used.

With logger you can print out message with different priorities (e.g. if you want to write an error message to mail.* logs), you can do so with:
 

 logger -i -p mail.err "Output of mail processing script"


To log a normal non-error (priority message) with logger to /var/log/mail.log system log.

 

 logger -i -p mail.notice "Output of mail processing script"


A whole list of supported facility named priority valid levels by logger (as taken of its current Linux manual) are as so:

 

FACILITIES AND LEVELS
       Valid facility names are:

              auth
              authpriv   for security information of a sensitive nature
              cron
              daemon
              ftp
              kern       cannot be generated from userspace process, automatically converted to user
              lpr
              mail
              news
              syslog
              user
              uucp
              local0
                to
              local7
              security   deprecated synonym for auth

       Valid level names are:

              emerg
              alert
              crit
              err
              warning
              notice
              info
              debug
              panic     deprecated synonym for emerg
              error     deprecated synonym for err
              warn      deprecated synonym for warning

       For the priority order and intended purposes of these facilities and levels, see syslog(3).

 


If you just want to log to Linux main log file (be it /var/log/syslog or /var/log/messages), depending on the Linux distribution, just type', even without any shell quoting:

 

logger 'The reason to reboot the server Currently was a System security Update

 

So what others is logger useful for?

 In addition to being a good diagnostic tool, you can use logger to test if all basic system logs with its respective priorities work as expected, this is especially
useful as I've seen on a Cloud Holsted OpenXEN based servers as a SAP consultant, that sometimes logging to basic log files stops to log for months or even years due to
syslog and syslog-ng problems hungs by other thirt party scripts and programs.
To test test all basic logging and priority on system logs as expected use the following logger-test-all-basic-log-logging-facilities.sh shell script.

 

#!/bin/bash
for i in {auth,auth-priv,cron,daemon,kern, \
lpr,mail,mark,news,syslog,user,uucp,local0 \
,local1,local2,local3,local4,local5,local6,local7}

do        
# (this is all one line!)

 

for k in {debug,info,notice,warning,err,crit,alert,emerg}
do

logger -p $i.$k "Test daemon message, facility $i priority $k"

done

done

Note that on different Linux distribution verions, the facility and priority names might differ so, if you get

logger: unknown facility name: {auth,auth-priv,cron,daemon,kern,lpr,mail,mark,news, \
syslog,user,uucp,local0,local1,local2,local3,local4, \
local5,local6,local7}

check and set the proper naming as described in logger man page.

 

4. Using a file descriptor that will output to a pre-set log file


Another way is to add the following code to the beginning of the script

#!/bin/bash
exec 3>&1 4>&2
trap 'exec 2>&4 1>&3' 0 1 2 3
exec 1>log.out 2>&1
# Everything below will go to the file 'log.out':

The code Explaned

  •     Saves file descriptors so they can be restored to whatever they were before redirection or used themselves to output to whatever they were before the following redirect.
    trap 'exec 2>&4 1>&3' 0 1 2 3
  •     Restore file descriptors for particular signals. Not generally necessary since they should be restored when the sub-shell exits.

          exec 1>log.out 2>&1

  •     Redirect stdout to file log.out then redirect stderr to stdout. Note that the order is important when you want them going to the same file. stdout must be redirected before stderr is redirected to stdout.

From then on, to see output on the console (maybe), you can simply redirect to &3. For example
,

echo "$(date) : Do print whatever you want logging to &3 file handler" >&3


I've initially found out about this very nice bash code from serverfault.com's post how can I fully log all bash script actions (but unfortunately on latest Debian 10 Buster Linux  that is prebundled with bash shell 5.0.3(1)-release the code doesn't behave exactly, well but still on older bash versions it works fine.

Sum it up


To shortlysummarize there is plenty of ways to do logging from a shell script logger command but using a function or a named pipe is the most classic. Sometimes if a script is supposed to write user or other script output to a a common file such as syslog, logger command can be used as it is present across most modern Linux distros.
If you have a better ways, please drop a common and I'll add it to this article.

 

How to shutdown Windows after 1, 2, 3, 4 etc. X hours with a batch script – Shutdown / Reboot / Logoff Windows with a quick command

Wednesday, August 17th, 2016

https://www.pc-freak.net/images/windows-pc-server-shutdown-after-3-5-hours-howto-shutdown-windows-with-command-batch

I recently wondered how it is possible to shutdown Windows in some prior set time lets say in 30 minutes, 1 hour, 3 hours or 8 hours.

That's handy especially on servers that are being still in preparation install time and you have left some large files copy job (if you're migration files) from Old server environment to a new one
or if you just need to let your home WIndows PC shutdown to save electricity after some time (a very useful example is if you're downloading some 200GB of data which are being estimated to complete in 3 hours but you need to get out and be back home in 2 or 4 days and you don't want to bother connecting remotely to your PC with VNC or teamviewer then just scheduling the PC / server to shutdown in 3 hours with a simple is perfect solution to the task, here is how:

1. Open Command Prompt (E.g. Start menu -> Run and type CMD.EXE)

2. Type in command prompt

 

shutdown -s -t 10800

 

If you by mistake has typed it to shutdown earlier and suddenly you find out your PC needs to be running for a short more time in order to cancel the scheduled Shutdown type:

 

shutdown -a


Shutdown Windows command -s flag has also a possibiltiy to not shutdown but just logoff or if you just need to have the system rebooted a reboot option:
 


options    effect
-l         to log off
-r         to reboot

If you need to shutdown the PC after half an hour use instead the command:

 

shutdown -s -t 1800


shutdown-windows-pc-with-command-in-half-an-hour-screenshot.gif


Half an hour is 1800 seconds for one hour delayed shutdown use 3600 for 3 hours, that would be 3*3600 10800, for 5 hours 5*3600 = 18000 seconds and so on

 


An alternative way to do it with a short VBscript, here is an example:

Set objShell = CreateObject("WScript.Shell")

Dim Input
Input = "10:00"

'Input = InputBox("Enter the shutdown time here.","", "10:00")

For i = 1 to 2

CurrentTime = Time & VbCrLf

If Left(CurrentTime,5) = Input Then

objShell.Run "shutdown -s -t 00", 0
WScript.Quit 1

Else

WScript.Sleep 1000

End If

i=i-1

Next

Enjoy

How to Remove / Add SuSE Linux start service command

Thursday, July 2nd, 2015

opensuse-remove-add-new-service-geeko-suse-linux-mini-logo
If you happen to administer SUSE LINUX Enterprise Server 9 (x86_64) and you need to add or remove already existing /etc/init.d script or custom created Apache / Tomcat .. etc. service and you're already familiar with Fedora's / RHEL chkconfig, then the good news chkconfig is also available on SuSE and you can use in same way chkconfig to start / stop / enable / disable boot time services.

To list all available boot time init.d services use:
 

suse-linux:/etc # chkconfig –list

 

SuSEfirewall2_final       0:off  1:off  2:off  3:off  4:off  5:off  6:off
SuSEfirewall2_init        0:off  1:off  2:off  3:off  4:off  5:off  6:off
SuSEfirewall2_setup       0:off  1:off  2:off  3:off  4:off  5:off  6:off
Tivoli_lcfd1.bkp          0:off  1:off  2:off  3:off  4:off  5:off  6:off
activate_web_all          0:off  1:off  2:off  3:on   4:off  5:on   6:off
alsasound                 0:off  1:off  2:on   3:on   4:off  5:on   6:off
apache2                   0:off  1:off  2:off  3:off  4:off  5:off  6:off
apache2-eis               0:off  1:off  2:off  3:on   4:off  5:off  6:off
atd                       0:off  1:off  2:off  3:off  4:off  5:off  6:off
audit                     0:off  1:off  2:off  3:off  4:off  5:off  6:off
autofs                    0:off  1:off  2:off  3:off  4:off  5:off  6:off
autoyast                  0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.clock                0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.crypto               0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.device-mapper        0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.evms                 0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.idedma               0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.ipconfig             0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.isapnp               0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.klog                 0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.ldconfig             0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.loadmodules          0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.localfs              0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.localnet             0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.lvm                  0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.md                   0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.multipath            0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.proc                 0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.restore_permissions  0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.rootfsck             0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.sched                0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.scpm                 0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.scsidev              0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.shm                  0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.swap                 0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.sysctl               0:off  1:off  2:off  3:off  4:off  5:off  6:off
boot.udev                 0:off  1:off  2:off  3:off  4:off  5:off  6:off
coldplug                  0:off  1:on   2:on   3:on   4:off  5:on   6:off

 

To then stop the service:
 

suse-linux:/etc # chkconfig gtiweb off


If you prefer to do it the SuSE way and learn a bit more on SuSE boot time process check out:

 

suse-linux:/etc # man insserv


Removing already existing SuSE start-up script from init.d start up with insserv is done with:

suse-linux:/etc # cd /etc/init.d/
suse-linux:etc/init.d # insserv -r gtiweb
insserv: script ipmi.hp: service ipmidrv already provided!
insserv: script boot.multipath.2008-10-29: service boot.multipath already provided!


To install a new custom written and placed into /etc/inti.d/ on SuSE's server boot time with insserv:

 

suse-linux:/etc/init.d/ # insserv your_custom_script_name

Fix MySQL ibdata file size – ibdata1 file growing too large, preventing ibdata1 from eating all your server disk space

Thursday, April 2nd, 2015

fix-solve-mysql-ibdata-file-size-ibdata1-file-growing-too-large-and-preventing-ibdata1-from-eating-all-your-disk-space-innodb-vs-myisam

If you're a webhosting company hosting dozens of various websites that use MySQL with InnoDB  engine as a backend you've probably already experienced the annoying problem of MySQL's ibdata1 growing too large / eating all server's disk space and triggering disk space low alerts. The ibdata1 file, taking up hundreds of gigabytes is likely to be encountered on virtually all Linux distributions which run default MySQL server <= MySQL 5.6 (with default distro shipped my.cnf). The excremental ibdata1 raise appears usually due to a application software bug on how it queries the database. In theory there are no limitation for ibdata1 except maximum file size limitation set for the filesystem (and there is no limitation option set in my.cnf) meaning it is quite possible that under certain conditions ibdata1 grow over time can happily fill up your server LVM (Storage) drive partitions.

Unfortunately there is no way to shrink the ibdata1 file and only known work around (I found) is to set innodb_file_per_table option in my.cnf to force the MySQL server create separate *.ibd files under datadir (my.cnf variable) for each freshly created InnoDB table.
 

1. Checking size of ibdata1 file

On Debian / Ubuntu and other deb based Linux servers datadir is /var/lib/mysql/ibdata1

server:~# du -hsc /var/lib/mysql/ibdata1
45G     /var/lib/mysql/ibdata1
45G     total


2. Checking info about Databases and Innodb storage Engine

server:~# mysql -u root -p
password:

mysql> SHOW DATABASES;
+——————–+
| Database           |
+——————–+
| information_schema |
| bible              |
| blog               |
| blog-sezoni        |
| blogmonastery      |
| daniel             |
| ezmlm              |
| flash-games        |


Next step is to get some understanding about how many existing InnoDB tables are present within Database server:

 

mysql> SELECT COUNT(1) EngineCount,engine FROM information_schema.tables WHERE table_schema NOT IN ('information_schema','performance_schema','mysql') GROUP BY engine;
+————-+——–+
| EngineCount | engine |
+————-+——–+
|         131 | InnoDB |
|           5 | MEMORY |
|         584 | MyISAM |
+————-+——–+
3 rows in set (0.02 sec)

To get some more statistics related to InnoDb variables set on the SQL server:
 

mysqladmin -u root -p'Your-Server-Password' var | grep innodb


Here is also how to find which tables use InnoDb Engine

mysql> SELECT table_schema, table_name
    -> FROM INFORMATION_SCHEMA.TABLES
    -> WHERE engine = 'innodb';

+————–+————————–+
| table_schema | table_name               |
+————–+————————–+
| blog         | wp_blc_filters           |
| blog         | wp_blc_instances         |
| blog         | wp_blc_links             |
| blog         | wp_blc_synch             |
| blog         | wp_likes                 |
| blog         | wp_wpx_logs              |
| blog-sezoni  | wp_likes                 |
| icanga_web   | cronk                    |
| icanga_web   | cronk_category           |
| icanga_web   | cronk_category_cronk     |
| icanga_web   | cronk_principal_category |
| icanga_web   | cronk_principal_cronk    |


3. Check and Stop any Web / Mail / DNS service using MySQL

server:~# ps -efl |grep -E 'apache|nginx|dovecot|bind|radius|postfix'

Below cmd should return empty output, (e.g. Apache / Nginx / Postfix / Radius / Dovecot / DNS etc. services are properly stopped on server).

4. Create Backup dump all MySQL tables with mysqldump

Next step is to create full backup dump of all current MySQL databases (with mysqladmin):

server:~# mysqldump –opt –allow-keywords –add-drop-table –all-databases –events -u root -p > dump.sql
server:~# du -hsc /root/dump.sql
940M    dump.sql
940M    total

 

If you have free space on an external backup server or remotely mounted attached (NFS or SAN Storage) it is a good idea to make a full binary copy of MySQL data (just in case something wents wrong with above binary dump), copy respective directory depending on the Linux distro and install location of SQL binary files set (in my.cnf).
To check where are MySQL binary stored database data (check in my.cnf):

server:~# grep -i datadir /etc/mysql/my.cnf
datadir         = /var/lib/mysql

If server is CentOS / RHEL Fedora RPM based substitute in above grep cmd line /etc/mysql/my.cnf with /etc/my.cnf

if you're on Debian / Ubuntu:

server:~# /etc/init.d/mysql stop
server:~# cp -rpfv /var/lib/mysql /root/mysql-data-backup

Once above copy completes, DROP all all databases except, mysql, information_schema (which store MySQL existing user / passwords and Access Grants and Host Permissions)

5. Drop All databases except mysql and information_schema

server:~# mysql -u root -p
password:

 

mysql> SHOW DATABASES;

DROP DATABASE blog;
DROP DATABASE sessions;
DROP DATABASE wordpress;
DROP DATABASE micropcfreak;
DROP DATABASE statusnet;

          etc. etc.

ACHTUNG !!! DON'T execute!DROP database mysql; DROP database information_schema; !!! – cause this might damage your User permissions to databases

6. Stop MySQL server and add innodb_file_per_table and few more settings to prevent ibdata1 to grow infinitely in future

server:~# /etc/init.d/mysql stop

server:~# vim /etc/mysql/my.cnf
[mysqld]
innodb_file_per_table
innodb_flush_method=O_DIRECT
innodb_log_file_size=1G
innodb_buffer_pool_size=4G

Delete files taking up too much space – ibdata1 ib_logfile0 and ib_logfile1

server:~# cd /var/lib/mysql/
server:~#  rm -f ibdata1 ib_logfile0 ib_logfile1
server:~# /etc/init.d/mysql start
server:~# /etc/init.d/mysql stop
server:~# /etc/init.d/mysql start
server:~# ps ax |grep -i mysql

 

You should get no running MySQL instance (processes), so above ps command should return blank.
 

7. Re-Import previously dumped SQL databases with mysql cli client

server:~# cd /root/
server:~# mysql -u root -p < dump.sql

Hopefully import should went fine, and if no errors experienced new data should be in.

Altearnatively if your database is too big and you want to import it in less time to mitigate SQL downtime, instead import the database with:

server:~# mysql -u root -p
password:
mysql>  SET FOREIGN_KEY_CHECKS=0;
mysql> SOURCE /root/dump.sql;
mysql> SET FOREIGN_KEY_CHECKS=1;

 

If something goes wrong with the import for some reason, you can always copy over sql binary files from /root/mysql-data-backup/ to /var/lib/mysql/
 

8. Connect to mysql and check whether databases are listable and re-check ibdata file size

Once imported login with mysql cli and check whther databases are there with:

server:~# mysql -u root -p
SHOW DATABASES;

Next lets see what is currently the size of ibdata1, ib_logfile0 and ib_logfile1
 

server:~# du -hsc /var/lib/mysql/{ibdata1,ib_logfile0,ib_logfile1}
19M     /var/lib/mysql/ibdata1
1,1G    /var/lib/mysql/ib_logfile0
1,1G    /var/lib/mysql/ib_logfile1
2,1G    total

Now ibdata1 will grow, but only contain table metadata. Each InnoDB table will exist outside of ibdata1.
To better understand what I mean, lets say you have InnoDB table named blogdb.mytable.
If you go into /var/lib/mysql/blogdb, you will see two files
representing the table:

  •     mytable.frm (Storage Engine Header)
  •     mytable.ibd (Home of Table Data and Table Indexes for blogdb.mytable)

Now construction will be like that for each of MySQL stored databases instead of everything to go to ibdata1.
MySQL 5.6+ admins could relax as innodb_file_per_table is enabled by default in newer SQL releases.


Now to make sure your websites are working take few of the hosted websites URLs that use any of the imported databases and just browse.
In my case ibdata1 was 45GB after clearing it up I managed to save 43 GB of disk space!!!

Enjoy the disk saving! 🙂

Apache Webserver disable file extension – Forbid / Deny access in Apache config to certain file extensions for a Virtualhost

Monday, March 30th, 2015


If you're a Webhosting company sysadmin like me and you already have configured directory listing for certain websites / Vhosts and those files are mirrored from other  development webserver location but some of the uploaded developer files extensions which are allowed to be interptered such as php include files .inc / .htaccess mod_rewrite rules / .phps / .html / .txt need to be working on the dev / test server but needs to be disabled (excluded) from delivery or interpretting for some directory on the prod server.

Open Separate host VirtualHost file or Apache config (httpd.conf / apache2.conf)  if all Vhosts  for which you want to disable certain file extensions and add inside:

<Directory "/var/www/sploits">
        AllowOverride All

</Directory>

 

Extension Deny Rules such as:

For disabling .inc files from inclusion from other PHP sources:
 

<Files  ~ "\.inc$">
  Order allow,deny
  Deny from all
</Files>


To Disable access to .htaccess single file only

 

<Files ~ "^\.htaccess">
  Order allow,deny
  Deny from all
</Files>


To Disable .txt from being served by Apache and delivered to requestor browser:

 

<Files  ~ "\.txt$">
  Order allow,deny
  Deny from all
</Files>

 


To Disable any left intact .html from being delivered to client:

 

<Files  ~ "\.html$">
  Order allow,deny
  Deny from all
</Files>

 

Do it for as many extensions as you need.
Finally to make changes affect restart Apache as usual:

If on Deb based Linux issue:

/etc/init.d/apach2 restart


On CentOS / RHEL and other Redhats / RedHacks 🙂

/etc/init.d/httpd restart

How to completely disable Replication in MySQL server 5.1.61 on Debian GNU / Linux

Monday, July 16th, 2012

Replication_mysql_disable

Some time ago on one of the Database MySQL servers, I've configured replication as it was required to test somethings. Eventually it turned out replication will be not used (for some reason) it was too slow and not fitting our company needs hence we needed to disable it.

It seemed logical to me that, simply removing any replication related directives from my.cnf and a restart of the SQL server will be enough to turn replication off on the Debian Linux host. Therefore I proceeded removed all replication configs from /etc/my/my.cnf and issued MySQL restart i. e.:

sql-server:~# /etc/init.d/mysql restart
....

This however didn't turned off replication,as I thought and in phpmyadminweb frontend interface, replication was still appearing to be active in the replication tab.

Something was still making the SQL server still act as an Replication Slave Host, so after a bit of pondering and trying to remember, the exact steps I took to make the replication work on the host I remembered that actually I issued:

mysql> START SLAVE;

Onwards I run:

mysql> SHOW SLAVE STATUS;
....

and found in the database the server was still running in Slave Replication mode

Hence to turn off the db host run as a Slave, I had to issue in mysql cli:

mysql> STOP SLAVE;
Query OK, 0 rows affected, 1 warning (0.01 sec)
mysql> RESET SLAVE;
Query OK, 0 rows affected, 1 warning (0.01 sec)

Then after a reload of SQL server in memory, the host finally stopped working as a Slave Replication host, e.g.

sql-server:~# /etc/init.d/mysql restart
....

After the restart, to re-assure myself the SQL server is no more set to run as MySQL replication Slave host:

mysql> SHOW SLAVE STATUS;
Empty set (0.00 sec)

Cheers 😉

How to install VirtualBox Virtual Machine to run Windows XP on Ubuntu Linux (11.10)

Tuesday, January 17th, 2012

Enable_VirtualBox_Windows_XP-fullscreen-with-vboxguest-additions-iso
My beloved sister was complaining games were failing to properly be played with wine emulator , therefore I decided to be kind and help her by installing a Windows XP to run inside a Virtual Machine.My previous install experiments with running MS Windows XP on Linux was on Debian using QEMU virtualmachine emulator.
However as Qemu is a bit less interactive and slower virtualmachine for running Windows (though I prefer it for being completely free software), this time I decided to install the Windows OS with Virtualbox.

My hope was using VirtualBox would be a way easier but I was wrong… I've faced few troubles and I thought many people who initially try to install Virtualbox VM to run Windows on Ubuntu and other Debian based Linux distros will probably experience the same problems as mine, so here is how this article was born.

Here is what I did to have a VirtualBox OS emulator to run Windows XP SP2 on Ubuntu 11.10 Linux

1. Install Virtualbox required packages with apt

root@ubuntu:~# apt-get install virtualbox virtualbox-dkms virtualbox-guest-dkms root@ubuntu:~# apt-get install virtualbox-ose-dkms virtualbox-guest-utils virtualbox-guest-x11
...

If you prefer more GUI or lazy to type commands, the Software Package Manager can also be used to straight install the same packages.
virtualbox-dkms virtualbox-guest-dkms packages are the two which are absolutely necessery in order to enable VirtualBox to support installing Microsoft Windows XP. DKMS modules are also necessery to be able to emulate some other proprietary (non-free) operating systems.
The DKMS packages provide a source for building Vbox guest (OS) additional kernel modules. They also require the kernel source to be install otherwise they fail to compile.

Failing to build the DKMS modules will give you error every time you try to create new VirtualMachine container for installing a fresh Windows XP.
The error happens if the two packages do not properly build the vboxdrv extra Vbox kernel module while the Windows XP installer is loaded from a CD or ISO. The error to pop up is:

Kernel driver not installed (rc=-1908)

The VirtualBox Linux kernel driver (vboxdrv) is either not loaded or there is a permission problem with /dev/vboxdrv. Please reinstall the kernel module by executing

VirtualBox vboxdrv not loaded error Ubuntu Screen

To fix the error:

2. Install latest Kernel source that corresponds to your current kernel version

root@ubuntu:~# apt-get install linux-headers-`uname -r`
...

Next its necessery to rebuild the DKMS modules using dpkg-reconfigure:

3. Rebuild VirtualBox DKMS deb packages

root@ubuntu:~# dpkg-reconfigure virtualbox-dkms
...
root@ubuntu:~# dpkg-reconfigure virtualbox-guest-dkms
...
root@ubuntu:~# dpkg-reconfigure virtualbox-ose-dkms
...

Hopefully the copilation of vboxdrv kernel module should complete succesfully.
To test if all is fine just load the module:

4. Load vboxdrv virtualbox kernel module

root@ubuntu:~# modprobe vboxdrv
root@ubuntu:~#

If you get some error during loading, this means vboxdrv failed to properly compile, try read thoroughfully what the error is and fix it) ;).

As a next step the vboxdrv has to be set to load on every system boot.

5. Set vboxdrv to load on every Ubuntu boot

root@ubuntu:~# echo 'vboxdrv' >> /etc/modules

I am not sure if this step is required, it could be /etc/init.d/virtualbox init script automatically loads the module, anyways putting it to load on boot would do no harm, so better do it.

That's all now, you can launch VirtualBox and use the New button to initiate a new Virtual Machine, I will skip explaining how to do the configurations for a Windows XP as most of the configurations offered by default would simply work without any tampering.

After booting the Windows XP installer I simply followed the usual steps to install Windows and all went smoothly.
Below you see a screenshot showing the installed Windows XP Virtualbox saved VM session. The screenshot letters are in Bulgarian as my sisters default lanaguage for Ubuntu is bulgarian 😉

VirtualBox installed MS Windows VM screenshot

I hope this article helps someone out there. Please drop me a comment if you experience any troubles with it. Cya 🙂

Speed up WordPress / Joomla CMS and MySQL server on Linux with tmpfs ram file system / Decrease Website pageload times with RAM caching

Wednesday, March 4th, 2015

speed-up-accelerate-wordpress-joomla-drupal-cms-and-mysql-server-with-tmpfs_ramfs_decrease-pageload-times-with-ram-caching
As a WordPress blog owner and an sys admin that has to deal with servers running a lot of WordPress / Joomla / Droopal and other custom CMS installed on servers, performoing slow or big enough to put a significant load on servers
and I love efficiency and hardware cost saving is essential for my daily job, I'm constantly trying to find new ways to optimize Customer Website (WordPress) and rest of sites in order to utilize better our servers and improve our clients sites speed (and hence satisfaction). 

There is plenty of little things to do on servers but probably among the most crucial ones which we use nowadays that save us a lot of money is tmpfs, and earlier (ramfs) – previously known as shmfs).
TMPFS is a (Temporary File Storage Facility) Linux kernel technology based on ramfs (used by Linux kernel initrd / initramfs on boot time in order to load and store the Linux kernel in memory, before system hard disk partition file systems are mounted) which is heavily used by virtually all modern popular Linux distributions. 

Using ramfs (cramfs variation – Compressed ROM filesystem) has been used to store different system environment kernel and Desktop components of many Linux environment / applications and used by a lot of the Linux BootCD such as the most famous (Klaus Knopper's) KNOPPIX LiveCD and Trinity Rescue Kit Linux (TRK uses /dev/shm which btw can be seen on most modern Linux distros and is actually just another mounted tmpfs).
If you haven't tried Live Linux yet try it out as me and a lot of sysadmins out there use some kind of LiveLinux at least few times on yearly basis  to Recover Unbootable Linux servers after some applied remote Updates as well as for Rescuing (Save) Data from Linux server failing to properly boot because of hard disk (bad blocks) failures. As I said earlier TMPFS is also used on almost any distribution for the /dev/ filesystem which is kept in memory.

You can see which tmpfs partitions is used on your Linux server with:

 

debian-server:~# mount |grep -i tmpfs
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)

 

Above is an output from a standard Debian Linux server. On CentOS 7 standard mounted tmpfs are as follows:

 

[root@centos ~]# mount |grep -i tmpfs
devtmpfs on /dev type devtmpfs (rw,nosuid,seclabel,size=1016332k,nr_inodes=254083,mode=755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,seclabel)
tmpfs on /run type tmpfs (rw,nosuid,nodev,seclabel,mode=755)
tmpfs on /sys/fs/cgroup type tmpfs (rw,nosuid,nodev,noexec,seclabel,mode=755)

 

[root@centos ~]# df -h|grep -i tmpfs
devtmpfs                 993M     0  993M   0% /dev
tmpfs                   1002M   92K 1002M   1% /dev/shm
tmpfs                   1002M  8.8M  993M   1% /run
tmpfs                   1002M     0 1002M   0% /sys/fs/cgroup

The /run tmpfs mounted directory is also to be seen also on latest Ubuntus and Fedoras and is actually the good old /var/run ( where applications keep there pids and some small app related files) stored in tmpfs filesystem stored in memory.

If you're wondering what is /dev/shm and why it appears mounted on every single Linux Server / Desktop you've ever used this is a special filesystem shared memory which various running programs (processes) can use to transfer data quick and efficient between each other to preven the slow disk swapping. People using Linux for the rest 15 years should remember /dev/shm has been a target of a lot of kernel exploits as historically it had a lot of security issues.

While writting this article I've just checked about KNOPPIX developed amd just for info as of time of writting this distro has already 1000+ programs on CD version and 2600+ packages / application on DVD version.
Nowadays Knoppix is mostly used mostly as USB Live Flash drive as a lot of people are dropping CD / DVD use (many servers doesn't have a CD / DVD Drive) and for USB Live Flash Linux distros tmpfs is also key technology used as this gives the end user an amazing fast experience (Desktop applications run much fasten on Live USBs when tmpfs is used than when the slow 7200 RPM HDDs are used).

Loading big parts of the distribution within RAM (with tmpfs from Linux Kernel 2.4+ onwards) is also heavily used by a lot of Cluster vendors in most of Clustered (Cloud) Linux based environemnts, cause TMPFS gives often speeds up improvements to x30 times and decreases greatly I/O HDD. FreeBSD users will be happy to know that TMPFS is already ported and could be used on from FreeBSD 7.0+ onward.

In this small article I will give you example use on how I use tmpfs to speed up our WordPress Websites which use WP Caching plugins such as W3 Total Cache and WP Super Cache
and Hyper Cache / WP Super Cache disk caching and MySQL server as a Database backend.
Below example is wordpress specific but since it can be easily applied to JoomlaDrupal or any other CMS out there that uses mySQL server to make a lot of CPU expensive memory hungry (LEFT JOIN) queries which end up using a slow 7200 RPM hard disk.


 

1. Preparing tmpfs partitions for WordPress File Cache directory
 

If you want to give tmpfs a test drive, I recommend you try to create / mount a 20 Megabyte partition. To create a tmpfs partition you don't need to use a tool like mkfs.ext3 / mkfs.ext4 as TMPFS is in reality a virtual filesystem that is mapped in the server system physical RAM (volatile memory). TMPFS is very nice because if you run out of free RAM system starts a combination of RAM use + some Hard disk SWAP 
The great thing about TMPFS is it never uses all of the available RAM and SWAP, which would not halt your server if TMPFS partition gets filled, but instead you will start getting the usual "Insufficient Disk Space", just like with a physical HDD parititon. RAMFS cares much less about server compared to TMPFS, because if RAMFS is historically older.

ramfs file systems cannot be limited in size like a disk base file system which is limited by it’s capacity, thus ramfs will continue using memory storage until the system runs out of RAM and likely crashes or becomes unresponsive. This is a problem if the application writing to the file system cannot be limited in total size, so in my opinion you better stay away from RAMFS except you have a good idea what you're doing. Another disadvantage of RAMFS compared to TMPFS is you cannot see the size of the file system in df and it can only be estimated by looking at the cached entry in free.

Note that before proceeding to use TMPFS or RAMFS you should know besides having advantages, there are certain serious disadvantage that if the server using tmpfs (in RAM) to store files crashes the customer might loose his data, therefore using RAM filesystems on Production servers is best to be used just for caching folders which are regularly synchronized with (rsync) to some folder to assure no data will be lost on server reboot or crash.

Memory of fast storage areas are ideally suited for applications which need repetitively small data areas for caching or using as temporary space such as Jira (Issue and Proejct Tracking Software) Indexing  As the data is lost when the machine reboots the tmpfs stored data must not be data of high importance as even scheduling backups cannot guarantee that all the data will be replicated in the even of a system crash.

To test mounting a tmpfs virtual (memory stored) filesystem issue:
 

mount -t tmpfs tmpfs -o size=256m /mnt/tmpfs


If you want to test mount a ramfs instead:

 

 mount -t ramfs -o size=256m ramfs /mnt/ramfs

 

debian-server:~#  mount |grep -i -E "ramfs|tmpfs"
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /mnt/tmpfs type tmpfs (rw,size=256m)
ramfs on /mnt/ramfs type tmpfs (rw,size=256m)

 

Once mounted tmpfs can be used in the same way as any ext4 / reiserfs filesystem. In the same way to make mounts permanent, its necessery to add a line to /etc/fstab

To illustrate better a tmpfs use case on my blog running WordPress with W3TotalCache (W3TC) plugin cache folder in /var/www/blog/wp-content/w3tc to get advantage of tmpfs to store w3tc files.

a) Stop Apache

On Debian
 

debian-server:~# /etc/init.d/apache stop


On CentOS 
 

[root@centos ~]# /etc/init.d/httpd stop


b) Move w3tc dir to w3tc-bak

 

debian-server:~# cd /var/www/blog/wp-content/
debian-server:~# mv w3tc w3tc-bak

 

c) Create w3tc directory
 

debian-server:/var/www/blog/wp-content# mkdir w3tc
debian-server:/var/www/blog/wp-content# chown -R www-data:www-data w3tc


d) Add tmpfs record to /etc/fstab

My W3TC Cache didn't grow bigger than 2Gigabytes so I create a 2Giga directory for it by adding following in /etc/fstab 
 

debian-server:~# vim /etc/fstab

 

tmpfs /var/www/blog/wp-content/w3tc tmpfs defaults,size=2g,noexec,nosuid,uid=33,gid=33,mode=1755 0 0


You might also want to add the nr_inodes (option) to tmpfs while mounting. nr_inodes is the maximum inode for instance. Default is half the number of your physical RAM pages, (on a machine with highmem) the number of lowmem RAM page, some common option that should work is nr_inodes=5k, if you're unsure what this option does you can safely skip it 🙂

e) Mount new added tmpfs folder

Then to mount the newly added filesystem issue:
 

mount -a


Or if you're on a CentOS / RHEL server use httpd Apache user instead and whenever you have docroot and wordpress installed.

 

[root@centos ~]# chown -R apache:apache: w3tc


If you're using Apache SuPHP use whatever the UID / GID is proper.

On CentOS you will need to set proper UID and GID (UserID / GroupID), to find out which ones to to use check in /etc/passwd:
 

[root@centos ~]# grep -i apache /etc/passwd
apache:x:48:48:Apache:/var/www:/sbin/nologin


f) Move old w3tc cache from w3tc-bak to w3tc

 

debian-server:/var/www/blog/wp-content# mv w3tc-bak/* w3tc/

 

g) Start again Apache

On Debian:

 

debian-server:~# /etc/init.d/apache2 start

 


On CentOS:
 

[root@centos~]# /etc/init.d/httpd start

h) Keeping w3tc cache site folder synced

As I said earlier the biggest problem with caching (the reason why many hosting providers) and site admins refuse to use it is they might loose some data, to prevent data loss or at least mitigate the data loss to few minutes intervals it is a good idea to synchronize tmpfs kept folders somewhere to disk with rsync.

To achieve that use a cronjob like this:
 

debian-server:~# crontab -u root -e
*/5 * * * * /usr/bin/ionice -c3 -n7 /usr/bin/nice -n 19 /usr/bin/rsync -ah –stats –delete /var/www/blog/wp-content/w3tc/ /backups/tmpfs/cache/ 1>/dev/null


Note that you will need to have the /backups/tmpfs/cache folder existing, create it with:

 

debian-server:~# mkdir -p /backups/tmpfs/cache


You will also need to add a rsync synchronization from backupped folder to tmpfs (in case if the server gets accidently rebooted because it hanged or power outage), place in

/etc/rc.local

 

ionice -c3 -n7 nice -n 19 rsync -ahv –stats –delete /backups/tmpfs/cache/ /var/www/blog/wp-content/w3tc/ 1>/dev/null


(somewhere before exit 0) line
 

0 05 * * * /usr/bin/ionice -c3 -n7 /bin/nice -n 19 /usr/bin/rsync -ah –stats –delete /var/www/blog/wp-content/w3tc/ /backups/tmpfs/cache/ 1>/dev/null

 

 

2. Preparing tmpfs partitions for MySQL server temp File Cache directory


Its common that MySQL servers had to serve a lot of long and heavy SQL JOIN Queries mostly by related posts WP plugins such as (Zemanta Related Posts) and Contextual Related posts though MySQLs are well optimized  to work as much as efficient using mysql tuner (tuning primer) still often SQL servers get a lot of temp tables created to disk (about 25% to 30%) of all SQL queries use somehow HDD to serve queries and as this is very slow and there is file lock created the overall MySQL performance becomes sluggish at times to fix (resolve) that without playing with SQL code to optimize the slow queries the best way I found is by using TMPFS as MySQL temp folder.

To do so I create a TMPFS usually the size of 256 MB because this is usually enough for us, but other hosting companies might want to add bigger virtual temp disk:

a) Add tmpfs new dir to /etc/fstab

In /etc/fstab add below record with vim editor:
 

debian-server:~# vim /etc/fstab

 

tmpfs /var/mysqltmp tmpfs rw,gid=111,uid=108,size=256M,nr_inodes=10k,mode=0700 0 0

 

Note that the uid / and gid 105 and 114 are taken again from /etc/passwd

On Debian

debian-server:~# grep -i mysql /etc/passwd
mysql:x:108:111:MySQL Server,,,:/var/lib/mysql:/bin/false


On CentOS
 

[root@centos ~]# grep -i mysql /etc/passwd
mysql:x:27:27:MySQL Server:/var/lib/mysql:/bin/bash


b) Create folder /var/mysqltmp or whenever you want to place the tmpfs memory kept SQL folder

 

debian-server:~# mkdir /var/mysqltmp
debian-server:~# chown mysql:mysql /var/mysqltmp

 

debian-server:~# mount|grep -i tmpfs
tmpfs on /lib/init/rw type tmpfs (rw,nosuid,mode=0755)
udev on /dev type tmpfs (rw,mode=0755)
tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev)
tmpfs on /var/www/blog/wp-content/w3tc type tmpfs (rw,noexec,nosuid,size=2g,uid=33,gid=33,mode=1755)
tmpfs on /var/mysqltmp type tmpfs (rw,gid=108,uid=111,size=256M,nr_inodes=10k,mode=0700)


c) Add new path to tmpfs created folder in my.cnf 

Then  edit /etc/mysql/my.cnf

 

debian-server:~# vim /etc/mysql/my.cnf

[mysqld]
#
# * Basic Settings
#
user        = mysql
pid-file    = /var/run/mysqld/mysqld.pid
socket      = /var/run/mysqld/mysqld.sock
port        = 3306
basedir     = /usr
datadir     = /var/lib/mysql
tmpdir      = /var/mysqltmp

 

On CentOS edit and change tmpdir in same way within /etc/my.cnf


d) Finally Restart Apache and MySQL to make mysql start using new set tmpfs memory kept folder

On Debian:
 

debian-server:~# /etc/init.d/apache2 stop; /etc/init.d/mysql restart; /etc/init.d/apache2 start

On CentOS:
 

[root@centos ~]# /etc/init.d/httpd stop; /etc/init.d/mysqld restart; /etc/initd/httpd start


Now monitor your server and check your pagespeed increase for me such an optimization usually improves site performance so site becomes +50% faster, to see the difference you can test your website before applying tmpfs caching for site and after that by using Google PageInsight (PageSpeed) Online Test. Though this example is for MySQL and WordPress you can easily adopt the same for Joomla if you have Joomla Caching enabled to some folder, same goes for any other CMS such as Drupal that can take use of Disk Caching. Actually its a small secret of many Hosting providers that allow clients to create sites via CPanel and Kloxo this tmpfs optimizations are already used for sites and by this the provider is able to offer better website service on lower prices. VPS hosting providers also use heavy caching. A lot of people are using TMPFS also to accelerate Sites that have enabled Google Pagespeed as Cacher and accelerator, as PageSpeed module puts a heavy HDD I/O load that can easily stone the server. Many admins also choose to use TMPFS for  /tmp, /var/run, and /var/lock directories as this leads often to significant overall server services operations improvement.
Once you have tmpfs enabled, It is a good idea to periodically monitor your SWAP used space with (df -h), because if you allocate bigger tmpfs partitions than your physical memory and tmpfs's full size starts to be used your machine will start swapping heavily and this could have a very negative performance affect.
 

debian-server:~# df -h|grep -i tmpfs
tmpfs            3,9G     0   3,9G   0% /lib/init/rw
tmpfs            3,9G     0   3,9G   0% /dev/shm
tmpfs            2,0G  1,4G   712M  66% /var/www/blog/wp-content/w3tc
tmpfs            256M     0   256M   0% /mnt/tmpfs
tmpfs            256M  236K   256M   1% /var/mysqltmp

The applications of tmpfs to accelerate services is up to your imagination, so I will be glad to hear from other admins on any interesting other application or problems faced while using TMPFS.

 Enjoy! 🙂

Auto restart Apache on High server load (bash shell script) – Fixing Apache server temporal overload issues

Saturday, March 24th, 2012

auto-restart-apache-on-high-load-bash-shell-script-fixing-apache-temporal-overload-issues

I've written a tiny script to check and restart, Apache if the server encounters, extremely high load avarage like for instance more than (>25). Below is an example of a server reaching a very high load avarage:;

server~:# uptime
13:46:59 up 2 days, 18:54, 1 user, load average: 58.09, 59.08, 60.05
load average: 0.09, 0.08, 0.08

Sometimes high load avarage is not a problem, as the server might have a very powerful hardware. A high load numbers is not always an indicator for a serious problems. Some 16 CPU dual core (2.18 Ghz) machine with 16GB of ram could probably work normally with a high load avarage like in the example. Anyhow as most servers are not so powerful having such a high load avarage, makes the machine hardly do its job routine.

In my specific, case one of our Debian Linux servers is periodically reaching to a very high load level numbers. When this happens the Apache webserver is often incapable to serve its incoming requests and starts lagging for clients. The only work-around is to stop the Apache server for a couple of seconds (10 or 20 seconds) and then start it again once the load avarage has dropped to less than "3".

If this temporary fix is not applied on time, the server load gets increased exponentially until all the server services (ssh, ftp … whatever) stop responding normally to requests and the server completely hangs …

Often this server overloads, are occuring at night time so I'm not logged in on the server and one such unexpected overload makes the server unreachable for hours.
To get around the sudden high periodic load avarage server increase, I've written a tiny bash script to monitor, the server load avarage and initiate an Apache server stop and start with a few seconds delay in between.

#!/bin/sh
# script to check server for extremely high load and restart Apache if the condition is matched
check=`cat /proc/loadavg | sed 's/\./ /' | awk '{print $1}'`
# define max load avarage when script is triggered
max_load='25'
# log file
high_load_log='/var/log/apache_high_load_restart.log';
# location of inidex.php to overwrite with temporary message
index_php_loc='/home/site/www/index.php';
# location to Apache init script
apache_init='/etc/init.d/apache2';
#
site_maintenance_msg="Site Maintenance in progress - We will be back online in a minute";
if [ $check -gt "$max_load" ]; then>
#25 is load average on 5 minutes
cp -rpf $index_php_loc $index_php_loc.bak_ap
echo "$site_maintenance_msg" > $index_php_loc
sleep 15;
if [ $check -gt "$max_load" ]; then
$apache_init stop
sleep 5;
$apache_init restart
echo "$(date) : Apache Restart due to excessive load | $check |" >> $high_load_log;
cp -rpf $index_php_loc.bak_ap $index_php_loc
fi
fi

The idea of the script is partially based on a forum thread – Auto Restart Apache on High Loadhttp://www.webhostingtalk.com/showthread.php?t=971304Here is a link to my restart_apache_on_high_load.sh script

The script is written in a way that it makes two "if" condition check ups, to assure 100% there is a constant high load avarage and not just a temporal 5 seconds load avarage jump. Once the first if is matched, the script first tries to reduce the server load by overwritting a the index.php, index.html script of the website with a one stating the server is ongoing a maintenance operations.
Temporary stopping the index page, often reduces the load in 10 seconds of time, so the second if case is not necessery at all. Sometimes, however this first "if" condition cannot decrease enough the load and the server load continues to stay too high, then the script second if comes to play and makes apache to be completely stopped via Apache init script do 2 secs delay and launch the apache server again.

The script also logs about, the load avarage encountered, while the server was overloaded and Apache webserver was restarted, so later I can check what time the server overload occured.
To make the script periodically run, I've scheduled the script to launch every 5 minutes as a cron job with the following cron:

# restart Apache if load is higher than 25
*/5 * * * * /usr/sbin/restart_apache_on_high_load.sh >/dev/null 2>&1

I have also another system which is running FreeBSD 7_2, which is having the same overload server problems as with the Linux host.
Copying the auto restart apache on high load script on FreeBSD didn't work out of the box. So I rewrote a little chunk of the script to make it running on the FreeBSD host. Hence, if you would like to auto restart Apache or any other service on FreeBSD server get /usr/sbin/restart_apache_on_high_load_freebsd.sh my script and set it on cron on your BSD.

This script is just a temporary work around, however as its obvious that the frequency of the high overload will be rising with time and we will need to buy new server hardware to solve permanently the issues, anyways, until this happens the script does a great job 🙂

I'm aware there is also alternative way to auto restart Apache webserver on high server loads through using monit utility for monitoring services on a Unix system. However as I didn't wanted to bother to run extra services in the background I decided to rather use the up presented script.

Interesting info to know is Apache module mod_overload exists – which can be used for checking load average. Using this module once load avarage is over a certain number apache can stop in its preforked processes current serving request, I've never tested it myself so I don't know how usable it is. As of time of writting it is in early stage version 0.2.2
If someone, have tried it and is happy with it on a busy hosting servers, please share with me if it is stable enough?