Posts Tagged ‘Maximum’

Linux extending life time for a damaged hard drive server tricks on a live server. Force fsck on next reboot. Read-only file system error solutions

Friday, February 17th, 2023

linux-extending-life-time-for-a-damaged-hard-drive-server-tricks-can-not-read-superblock-linux-force-fsck-on-next-reboot

In our daily work as system administrators we look after some very old legacy systems running clustered High Availability proxies using CRM (Cluster Resource Manager), and some of these legacy systems still use Heartbeat to manage the cluster instead of the newer and more modern Corosync variant.

The HA cluster is only a 2-node Linux setup, running the obscure, long unsupported Red Hat Enterprise Linux 5.11 (Tikanga), a release line that went stable back in the distant 2007 (yeah, the years were good) and whose EOL (End of Life) was reached long ago, so the OS is no longer supported. However, for about 14 years the machines have been running perfectly fine, until one of the cluster nodes managed by ocf::heartbeat:IPAddr2, that is the /etc/ha.d/resource.d/IPAddr2 shell script, started having problems. Yeah, for the newbies: the Heartbeat Application Cluster in Linux does work like that, it uses a set of extendable paired shell scripts written for different kinds of Network / Web / Mail / SQL or whatever other services' HA management.

The first configured node, however, started failing with errors like:
 

EXT3-fs error (device dm-1): ext3_journal_start_sb: Detected aborted journal
sd 0:2:0:0: rejecting I/O to offline device
Aborting journal on device sda1.
sd 0:2:0:0: rejecting I/O to offline device
printk: 159 messages suppressed.
Buffer I/O error on device sda1, logical block 526
lost page write due to I/O error on sda1
sd 0:2:0:0: rejecting I/O to offline device
sd 0:2:0:0: rejecting I/O to offline device
ext3_abort called.
EXT3-fs error (device sda1): ext3_journal_start_sb: Detected aborted journal
Remounting filesystem read-only
sd 0:2:0:0: rejecting I/O to offline device
sd 0:2:0:0: rejecting I/O to offline device
sd 0:2:0:0: rejecting I/O to offline device
sd 0:2:0:0: rejecting I/O to offline device
sd 0:2:0:0: rejecting I/O to offline device
megaraid_sas: FW was restarted successfully, initiating next stage…
megaraid_sas: HBA recovery state machine, state 2 starting…
megasas: Waiting for FW to come to ready state
megasas: FW in FAULT state!!
FW state [-268435456] hasn't changed in 180 secs
megaraid_sas: out: controller is not in ready state
megasas: waiting_for_outstanding: after issue OCR. 
megasas: waiting_for_outstanding: before issue OCR. FW state = f0000000
megaraid_sas: pending commands remain even after reset handling. megasas[0]: Dumping Frame Phys Address of all pending cmds in FW
megasas[0]: Total OS Pending cmds : 0 megasas[0]: 64 bit SGLs were sent to FW
megasas[0]: Pending OS cmds in FW :

The result of all that was that the machine's filesystem frequently got re-mounted as Read-Only, and of course that is
quite bad if you have running haproxy processes that are supposed to stay alive and take up some of the Web traffic
for high availability, because instead all the traffic ends up running only on the 2nd machine of the pair.

This of course was a clear sign of failing disks, some hit bad block regions or, as the messages indicate, some
problem with the system hardware or the SAS RAID array.

The physical RAID controller on the system, just like the rest of the hardware, is very old stuff as well.

[root@haproxy_lb_node1 ~]# lspci |grep -i RAI
01:00.0 RAID bus controller: LSI Logic / Symbios Logic MegaRAID SAS 2108 [Liberator] (rev 05)

The produced errors not only made the machine auto-mount its root / filesystem in Read-Only mode, but most
likely also made the machine automatically reboot every few days, or a few times a day in a row.

The second Load Balancer node2 operated perfectly, and we thought we might just keep the broken machine in that half-running
and inconsistent state for a few weeks until we had built the new machines with a pre-installed new haproxy cluster on a modern
RedHat Linux 8.6 distribution, but we have to follow SLAs (Service Level Agreements) with Customers and the end services behind the
High Availability (HA) Haproxy cluster were in danger …

We as sysadmins had the task to do our best to stabilize the unstable node with the disk errors, so the system could survive
and be able to serve traffic normally (in case node2, which is in a separate Data center, fails due to hardware or electricity issues etc.).

Here are a few steps we took that have hopefully improved the situation.

1. Make backups of the most important files

Always, before doing anything with a broken system, prepare a backup of the most important files. For a cluster that should be a backup of the cluster configuration (if you don't already have one), a backup of /etc/hosts, of any important service configs such as /etc/haproxy/haproxy.cfg and the /etc/postfix/ configs (as it was in my case), preferably a backup of the whole /etc/, any important files from the /root/ or /home/users* directories and at least the latest logs from /var/log etc.
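For example, a quick way to pack the most critical configs into a single archive and ship it off the failing box could look like this (a minimal sketch; the exact list of paths is of course specific to your setup, and remote.host.com is just a placeholder for any healthy machine you can copy to):

# mkdir -p /root/backup
# tar -czvf /root/backup/critical-confs-$(date +%Y-%m-%d).tar.gz \
/etc/hosts /etc/haproxy /etc/postfix /etc/ha.d /var/spool/cron /var/log/messages
# scp /root/backup/critical-confs-*.tar.gz root@remote.host.com:/root/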
 

2. Clear up all unnecessary services and scripts from the server

Any additional Software / Services, integrity checking tools (daemons), scripts and cron jobs were immediately stopped and, where unused, removed.
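On an old RHEL 5 like this one, stopping a daemon and making sure it does not come back on next boot is done with the service / chkconfig pair; a minimal sketch (clamd here is just an example name, substitute whatever unneeded services you actually find running):

# service clamd stop
# chkconfig clamd off
# chkconfig --list | grep ':on'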

E.g. we went through /etc/cron* to check what's there:

# ls -ld /etc/cron.*
drwx------ 2 root root 4096 Feb  7 18:13 /etc/cron.d
drwxr-xr-x 2 root root 4096 Feb  7 17:59 /etc/cron.daily
-rw-r--r-- 1 root root    0 Jul 20  2010 /etc/cron.deny
drwxr-xr-x 2 root root 4096 Jan  9  2013 /etc/cron.hourly
drwxr-xr-x 2 root root 4096 Jan  9  2013 /etc/cron.monthly
drwxr-xr-x 2 root root 4096 Aug 26  2015 /etc/cron.weekly

 

And like good professional butchers we removed everything unnecessary that could trigger any extra disk reads / writes to the HDD.

E.g. just create

# mkdir -p /root/etc_old/etc/{cron.d,cron.daily,cron.hourly,cron.monthly,cron.weekly}

 

And moved all unnecessary cron job scripts (an example of the actual mv commands follows right after the list), like:

1. nmon (old school console tool for monitoring and tuning server network / memory / hard disk parameters)
2. clamscan / freshclam crons
3. mlocate (the cron script that takes care of the periodic run of the updatedb command, which keeps the locate command's file DB up
to date so you can search for files with fewer disk read operations, instead of running every time something
like: find / -iname '*whatever_you_look_for*')
4. cups cron jobs
5. logwatch cron
6. rkhunter stuff
7. logrotate (yes, we stopped even the log rotation trigger job, as we found the server was sometimes crashing at the same time
the logrotate job was rotating the logs inside /var/log/*, perhaps leading to a hit of the I/O read error (bad blocks)).


We also inspected the Administrator (root) user cron jobs for any unwanted scripts and stopped two report bash scripts that were part of the PCI tightened security procedures.
There we found a script responsible for periodically reporting the list of installed packages and whether it had changed, as well as a script to periodically report via email the list of
users created in /etc/passwd and /etc/shadow, used historically to keep an eye on the list of users and easily see if someone
has created new users on the machine. Those were enabled via /var/spool/cron/root cron jobs. In other cases, on other machines, it is a good idea
to check out all the existing user cron jobs and stop anything that might be putting extra Read / Write pressure on the machine's attached hard drives.

# ls -al /var/spool/cron/
total 20
drwx------  2 root root 4096 Nov 13  2015 .
drwxr-xr-x 12 root root 4096 May 11  2011 ..
-rw-------  1 root root  133 Nov 13  2015 root


3. Clear up old log files and any unnecessary files

Under /var/log, /home, /var/tmp and /var/spool/tmp immediately try to clear up the old log files.
From my past experience this has many times freed up FS inodes that are stored on an unbroken part (good blocks) of the hard drive,
ready to be reused by the files newly written by the rsyslog / syslogd services.

!!! Note that during the removal of some files you might hit files stored on bad blocks, which might lead to an unexpected system reboot.

But that's okay, don't worry; most likely after a hard reset by a technician in the Datacenter the machine will boot again and you can continue
removing the remaining files, sending them off to the heaven for old files.
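A quick way to spot and then remove already rotated / compressed logs older than, say, 30 days is GNU find; a minimal sketch (review the -print output first before re-running with -delete):

# find /var/log -type f -name '*.gz' -mtime +30 -print
# find /var/log -type f -name '*.gz' -mtime +30 -delete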

 

4. Trigger an automatic filesystem check with fsck on next boot

The standard way to force Linux to automatically recheck its root filesystem is to simply create a /forcefsck file in the root partition, or in the top directory of any other secondary disk partition you would like checked.

# touch /forcefsck

# reboot


However on some occasions you might be unable to do that, because the / (root fs) has already been remounted in Read-Only mode, yikes …

Luckily old Linux distributions like this RHEL 5.11 have another way to force a filesystem check on reboot, identify any
unknown bad blocks and hopefully succeed in isolating them, so you don't run into the same auto-reboots (provided the hard drive or the Software / Hardware RAID
is not in a terrible state): you can use the '-F' option built into the /sbin/shutdown command:

   -F     Force fsck on reboot.


Hence, to make the machine reboot and immediately trigger fsck:

# shutdown -rF now


Just in case you wonder why you need to reboot before checking the filesystem: simply because the filesystems need to be unmounted before you check them.

In this specific case that produced a good result so far: the machine booted just fine and we crossed our fingers and prayed that it would keep working flawlessly for the coming few weeks, until we finalize the configuration of the substitute machines to which this old infrastructure will be migrated, a newly built cluster with a new Haproxy and Corosync / Pacemaker on a brand new RHEL.

NB! On newer machines this won't work, however, as the shutdown command has been stripped of this option: it existed under the SystemV (SysV init) / Upstart architecture and is not present on the newer systemd services architecture.
 

5. Hints on checking the hard drives with fsck

If you happen to have physical access to the remote hardware machine via a TTY[1-9] console, that's even better and is the standard way to do it, but in this specific case we had no easy way to get access to the physical server console.

It is even better to go there and, via either a connected Monitor (Display) or a KVM switch (for those hearing about a KVM switch for the first time, it is a great device in server rooms that lets you manage multiple servers from the same keyboard / monitor / display), boot one of the multitude of USB Linux recovery distros to choose from, or on older machines like this one a CDROM / DVD with Redhat's rescue mode on it.
After booting into the rescue environment, simply check each of the disks (while they are not mounted),
e.g.:

# fsck -y /dev/sdb
# fsck -y /dev/sdc

Or, if you don't want to waste time checking each hard drive one by one, you can directly check all the ones that are attached and known to the Linux distro via the /etc/fstab definitions by running:

# fsck -AR

If necessary, and you have a mixture of filesystems, for example EXT3, EXT4, REISERFS, you can tell it to omit some filesystem, for example ext3, like that:

# fsck -AR -t noext3 -y


To make fsck skip mounted partitions:

# fsck -M /dev/sdb


One remark to make here on fsck: to complete its job on the various filesystems, fsck actually calls external helper binaries, usually stored as /sbin/fsck.*:

ls -al /sbin/fsck*
-rwxr-xr-x 1 root root  55576 20 Jan  2022 /sbin/fsck*
-rwxr-xr-x 1 root root  43272 20 Jan  2022 /sbin/fsck.cramfs*
lrwxrwxrwx 1 root root      9  4 Jul  2020 /sbin/fsck.exfat -> exfatfsck*
lrwxrwxrwx 1 root root      6  7 Jun  2021 /sbin/fsck.ext2 -> e2fsck*
lrwxrwxrwx 1 root root      6  7 Jun  2021 /sbin/fsck.ext3 -> e2fsck*
lrwxrwxrwx 1 root root      6  7 Jun  2021 /sbin/fsck.ext4 -> e2fsck*
-rwxr-xr-x 1 root root  84208  8 Feb  2021 /sbin/fsck.fat*
-rwxr-xr-x 2 root root 393040 30 Nov  2009 /sbin/fsck.jfs*
-rwxr-xr-x 1 root root 125184 20 Jan  2022 /sbin/fsck.minix*
lrwxrwxrwx 1 root root      8  8 Feb  2021 /sbin/fsck.msdos -> fsck.fat*
-rwxr-xr-x 1 root root    333 16 Dec  2021 /sbin/fsck.nfs*
lrwxrwxrwx 1 root root      8  8 Feb  2021 /sbin/fsck.vfat -> fsck.fat*


6. Using tune2fs to adjust tunable filesystem parameters on ext2/ext3/ext4 filesystems (a few examples)

a) To check whether the filesystem was really checked at boot time, or to check any filesystem on the server for the date of its last fsck check:

#  tune2fs -l /dev/sda1 | grep checked
Last checked:             Wed Apr 17 11:04:44 2019

On some distributions like old Debian and Ubuntu, it is even possible to make fsck log its operations during the check on reboot by changing the verbosity from NO to YES:

# sed -i "s/#VERBOSE=no/VERBOSE=yes/" /etc/default/rcS


If you're having the issues on old Debian Linuxes and not on RHEL, it is also possible to:

b) Enable all fsck repairs to run automatically on boot

by running:
 

# sed -i "s/FSCKFIX=no/FSCKFIX=yes/" /etc/default/rcS


c) Forcing an fsck check on server-attached Hard Drive Partitions with tune2fs

# tune2fs -c 1 /dev/sdXY

Note that:
tune2fs can force a fsck on each reboot for EXT4, EXT3 and EXT2 filesystems only.

tune2fs can trigger a forced fsck on every reboot using the -c (max-mount-counts) option.
This option sets the number of mounts after which the filesystem will be checked, so setting it to 1 will run fsck each time the computer boots.
Setting it to -1 or 0 resets this (the number of times the filesystem is mounted will be disregarded by e2fsck and the kernel).
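Once the disk drama is over and you no longer want a full check on every single boot, the counter can be put back to "never force by mount count" again with tune2fs:

# tune2fs -c -1 /dev/sdXY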


 For example you could:

d) Set fsck to run a filesystem check every 30 boots, by using -c 30 
 

# tune2fs -c 30 /dev/sdXY



e) Check when the file system /dev/sdX was last checked:
 

# tune2fs -l /dev/sdX | grep Last\ c
Last checked:             Thu Jan 12 20:28:34 2017


f) Check how many times our /dev/sdX filesystem was mounted

# tune2fs -l /dev/sdX | grep Mount
Mount count:              157

g) Check how many mounts are allowed to pass before filesystem check is forced
 

# tune2fs -l /dev/sdX | grep Max
Maximum mount count:      -1


7. Repairing disks / partitions via the GRUB fsck.mode and fsck.repair kernel boot options

It is also possible to force an fsck repair on boot via GRUB, but that usually is not an option someone would like to keep permanently, as the machine might fail to boot if the repair turns out to be hard or impossible; however, in difficult situations with failing disks, temporarily enabling it is a good idea.

This can be done by including in the initial grub config:

GRUB_CMDLINE_LINUX_DEFAULT="quiet splash fsck.mode=force fsck.repair=yes"

fsck.mode=force – will force an fsck each time the system boots; keeping that value enabled for a long time inside GRUB is unwise for servers, as booting could sometimes be severely prolonged because of the checks, especially on servers with many or slow old hard drives.

fsck.repair=yes – will make fsck try to repair whatever problems (bad blocks etc.) it finds when checking (be absolutely sure you know what you're doing if passing this option).

The options can also be set by editing the kernel line on the GRUB boot screen, if you have physical access to the server and don't want to regenerate the grub loader config and possibly make the machine unbootable on next boot.
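If you do decide to set the options persistently rather than editing the boot entry by hand, the usual procedure is to add them to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub and regenerate the grub config (a sketch; which of the two regeneration commands applies depends on the distribution):

# vim /etc/default/grub

On Debian / Ubuntu:

# update-grub

On RHEL / CentOS / Fedora:

# grub2-mkconfig -o /boot/grub2/grub.cfg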
 

8. A few more details on how the /etc/fstab fsck check parameter values work on systemd Linux machines

The "proper" way on systemd (if we can talk about a proper way on Linux) to run fsck for each filesystem is to set a number greater than 0 in the
last column of /etc/fstab, so make sure you edit your /etc/fstab if that's not the case.

The root partition should be set to 1 (first to be checked), while other partitions you want to be checked should be set to 2.

Example /etc/fstab:
 

# /etc/fstab: static file system information.

/dev/sda1  /      ext4  errors=remount-ro  0  1
/dev/sda5  /home  ext4  defaults           0  2

The meaning of the values you can put in this last column is as follows:
0 – disabled, that is, do not check the filesystem
1 – a partition with this PASS value has a higher priority and is checked first. This value is usually set for the root / partition
2 – partitions with this PASS value will be checked last

a) Check the log produced by fsck

Unfortunately on the older SystemV based Linux distros the fsck log output might not be saved anywhere except on the physical console, so if you have some kind of console duplicator device / physical tty attached to the display port of the server, you might capture some bad block reports or fixed error messages, but if you don't, you can just cross your fingers and hope that whatever FS irregularities were found have been recovered.

On systemd Linux machines the fsck log should be produced either in /run/initramfs/fsck.log or some other location depending on the Linux distro and you should be able to see something from fsck inside /var/log/* logs:

# grep -rli fsck /var/log/*

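On systemd machines each checked filesystem also gets its own systemd-fsck@ unit, so (assuming your journalctl version supports unit glob patterns) the check results from the last boot can usually be pulled straight out of the journal:

# journalctl -b -u 'systemd-fsck*'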

Closing it up

Having a system with a failing disk is really one of the worst sysadmin nightmares to get. The good news is that in most cases we're prepared with some working backup or some workaround stuff like the few steps explained above to mitigate the amount of Reads / Writes to the failing machine's HDDs. If the failing disk holds the primary Linux filesystem, everything becomes even worse, as on every next reboot you have no guarantee that the kernel / initrd or some of the other system components required to run the core Linux system won't break the normal boot. Thus changes on the hard drives are a risky business; on the other side, if you are in a situation where you have a mirror system, or the failing system is just a Linux server installed without a cluster pair, then this is not such a big deal, as long as you can guarantee at least one of the nodes stays up, running and serving.

Still, doing too many operations on a failing HDD is always a danger, so even though the steps described in most cases lead to an improvement in how the system behaves, the system should be considered totally unreliable and closely monitored, not only with some monitoring stack like Zabbix / Prometheus or whatever, but also by regularly checking the system's state via normal SSH logins. If you have some important data or logs on the system that are not synchronized to another node, it is important to copy them off before doing any of the described operations. Once at least the bare minimum is backed up, proceed to clear up everything that can be cleared while still letting the machine provide most of its functionality, trigger the automatic fsck HDD check on next reboot, reboot, check what is going on and monitor the machine from there on.

Hopefully the few described steps have helped some sysadmin. There are plenty of things that might still go wrong; even following the described steps might not help if the machine's Storage Drives / SAS / SSDs have too much damage. But, as said, in most cases following these few steps will improve the machine's state.

Wish you the best of luck!

 

How to improve Linux kernel security with GrSecurity / Maximum Linux kernel security with GrSecurity

Tuesday, May 3rd, 2011

In short I'll explain here what Grsecurity ( http://www.grsecurity.net/ ) is, for all those who have not used it yet, and what kind of enhanced kernel security capabilities it has.

Grsecurity is a set of patches for the Linux kernel aimed at improving kernel security.

The typical application of GrSecurity is on Linux systems which are administered through SSH / Shell (e.g. remote hosts), though you can also configure grsecurity on a normal Linux desktop system if you want a super secured Linux desktop ;).

GrSecurity is used heavily to protect server systems which require multiple users to have access to the shell.

On systems where multiple user access is required it's a well known fact that (malicious users, crackers or dumb script kiddies) gain administrator (root) privileges with some freshly popped-up 0-day root kernel exploit.
If you're an administrator of a system (let's say a web hosting server) with multiple users having access to the shell, it's also common that exploits targeting a certain daemon / service are executed by some of the users.
On other occasions you have users who are trying to DoS the server with some 0-day Denial of Service exploit.
In all these cases having a kernel patched with GrSecurity is priceless.

Installing a grsecurity patched kernel is an easy task on Debian and Ubuntu and is explained in one of my previous articles.
This article aims to explain in short some configuration options for a GrSecurity tightened kernel, when one has to compile a new kernel from source.

I will skip the details on how to compile the kernel and simply show you some screenshots with GrSecurity configuration options which are working well and need to be set up before a make command is issued to compile the new kernel.

After preparing the kernel source for compilation and issuing:

linux:/usr/src/kernel-source$ make menuconfig

You will have to select options like the ones you see in the pictures below:

[Screenshots: GrSecurity menuconfig options]

After completing and saving your kernel config file, continue as usual with an ordinary kernel compilation, e.g.:

linux:/usr/src/kernel-source$ make
linux:/usr/src/kernel-source$ make modules
linux:/usr/src/kernel-source$ su root
linux:/usr/src/kernel-source# make modules_install
linux:/usr/src/kernel-source# make install
linux:/usr/src/kernel-source# mkinitrd -o initrd.img-2.6.xx 2.6.xx

Also make sure the grub is properly configured to load the newly compiled and installed kernel.
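On Debian the easiest way to make sure of that is to simply regenerate the grub menu right after the make install step (a sketch, assuming the stock Debian grub tooling):

linux:/usr/src/kernel-source# update-grub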

After a system reboot, if all is fine you should be able to boot up the grsecurity tightened newly compiled kernel, but be careful and make sure you have a backup solution before you reboot, don’t blame me if your new grsecurity patched kernel fails to boot! You’re on your own boy 😉
This article was written thanks to static and is based originally on his article in Bulgarian. If you're a Bulgarian you might also check out static's blog.

How to tune MySQL Server to increase MySQL performance using mysqltuner.pl and Tuning-primer.sh

Tuesday, August 9th, 2011

MySQL Easy performance tuning with mysqltuner.pl and Tuning-primer.sh scripts

Improving MySQL performance is crucial for improving website response times, reducing server load and improving the overall work efficiency of a MySQL database server.

I've seen, however, many Linux system administrators who belittle or completely miss the significance of tuning a newly installed MySQL server.
The reason behind that is probably the fact that many people think the MySQL config variables would not significantly improve performance and do not pay back the optimization effort. Moreover, there are a bunch of system admins who have to take care of numerous services, so they don't have time to build good knowledge on optimizing MySQL servers.
Thus many admins and webmasters nowadays think optimizations depend mostly on the website programmers.
It's also sometimes falsely believed that optimizing a MySQL server could reduce the overall server stability.

With the boom of Internet website building and internet marketing, many webmasters emerged, and almost anybody with little to no knowledge of GNU/Linux or PHP can start his online store, open a blog or create a website powered by some CMS like Joomla.
Thus nowadays many servers don't even have a hired system administrator but are managed by people whose knowledge of *nix is next to zero; this is another reason why dozens of MySQL installations online run with the defaults and are not taking good advantage of the server hardware.

The increase of website visitors makes people's expectations for server hardware grow as well, and thus many companies simply buy new hardware instead of taking a little time to investigate how the current server hardware can be utilized better.
In that manner of thought, I thought it would be a good idea to write this small article on tuning MySQL servers with two scripts: Tuning-primer.sh and mysqltuner.pl.
The scripts are ultra easy to use and require only minimal knowledge of MySQL, Linux or (*BSD *nix, if the SQL server is running on BSD).
Tuning-primer.sh and mysqltuner.pl are therefore suitable for quick MySQL server optimizations even for people who are not computer experts.

I use these two scripts for MySQL server optimizations on almost every newly configured GNU/Linux host with a MySQL backend.
Using the scripts comes down to simply downloading them with wget, lynx, curl or some other web client and executing them on the host which is already running the MySQL server.

Here is an example of how simple it is to run the scripts to Optimize MySQL:

debian:~# perl mysqltuner.pl
>> MySQLTuner 1.2.0 - Major Hayden <major@mhtx.net>
>> Bug reports, feature requests, and downloads at http://mysqltuner.com/
>> Run with '--help' for additional options and output filtering

-------- General Statistics --------------------------------------------------
[--] Skipped version check for MySQLTuner script
[OK] Currently running supported MySQL version 5.1.49-3
[OK] Operating on 64-bit architecture

-------- Storage Engine Statistics -------------------------------------------
[--] Status: +Archive -BDB -Federated +InnoDB -ISAM -NDBCluster
[--] Data in MyISAM tables: 6G (Tables: 952)
[!!] InnoDB is enabled but isn't being used
[!!] Total fragmented tables: 12

-------- Security Recommendations -------------------------------------------
[OK] All database users have passwords assigned

-------- Performance Metrics -------------------------------------------------
[--] Up for: 1d 2h 3m 35s (68M q [732.193 qps], 610K conn, TX: 49B, RX: 11B)
[--] Reads / Writes: 76% / 24%
[--] Total buffers: 512.0M global + 2.8M per thread (2000 max threads)
[OK] Maximum possible memory usage: 6.0G (25% of installed RAM)
[OK] Slow queries: 0% (3K/68M)
[OK] Highest usage of available connections: 7% (159/2000)
[OK] Key buffer size / total MyISAM indexes: 230.0M/1.7G
[OK] Key buffer hit rate: 97.8% (11B cached / 257M reads)
[OK] Query cache efficiency: 76.6% (46M cached / 61M selects)
[!!] Query cache prunes per day: 1822075
[OK] Sorts requiring temporary tables: 0% (1K temp sorts / 2M sorts)
[!!] Joins performed without indexes: 63635
[OK] Temporary tables created on disk: 1% (26K on disk / 2M total)
[OK] Thread cache hit rate: 99% (159 created / 610K connections)
[!!] Table cache hit rate: 4% (1K open / 43K opened)
[OK] Open file limit used: 17% (2K/16K)
[OK] Table locks acquired immediately: 99% (36M immediate / 36M locks)

-------- Recommendations -----------------------------------------------------
General recommendations:
Add skip-innodb to MySQL configuration to disable InnoDB
Run OPTIMIZE TABLE to defragment tables for better performance
Enable the slow query log to troubleshoot bad queries
Increasing the query_cache size over 128M may reduce performance
Adjust your join queries to always utilize indexes
Increase table_cache gradually to avoid file descriptor limits
Variables to adjust:
query_cache_size (> 256M) [see warning above]
join_buffer_size (> 256.0K, or always use indexes with joins)
table_cache (> 7200)

You see there are plenty of things the script reports; for the inexperienced, most of the information can be happily skipped without needing to decode the cryptic output. The section of importance here is Recommendations (for some clarity, I've made this section show up in bold).

The most important things in the Recommendations script output are actually the lines which give suggestions for increasing certain MySQL variables. In this example case these are the last three variables:
query_cache_size,
join_buffer_size and
table_cache

All of these variables are tuned from /etc/mysql/my.cnf on Debian and derivative distros, and from /etc/my.cnf on RHEL, CentOS, Fedora and the other RPM based Linux distributions.

On some custom server installs my.cnf is located in /usr/local/mysql/etc/ or some other, a bit more non-standard, location 😉

Anyways, now having the Recommendations from the script, it's necessary to edit my.cnf and try to double the values of the suggested variables.

First, I check if all the suggested variables are present in my config with grep; if they're not, then I'll simply add the variable with double the suggested value.
P.S.: One note here is that sometimes a value which is in effect is just the MySQL server default and does not have a record in my.cnf.

debian:~# grep -E 'query_cache_size|join_buffer_size|table_cache' /etc/mysql/my.cnf
table_cache = 7200
query_cache_size = 256M
join_buffer_size = 262144
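As the P.S. above notes, a variable missing from my.cnf is most likely just running with its built-in default; the value actually in effect can always be checked directly from the running server, e.g.:

debian:~# mysql -e "SHOW VARIABLES LIKE 'table_cache';"
debian:~# mysql -e "SHOW VARIABLES LIKE 'query_cache_size';"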

All of my variables are in the config, so now edit my.cnf and set the values to:
table_cache = 14400
query_cache_size = 512M
join_buffer_size = 524288

I always, however, preserve the old variable's value, because sometimes raising a value might create problems and the MySQL server might be unable to restart properly.
Thus, before adding the new values, make sure the old ones are commented out with #, e.g.:
#table_cache = 7200
#query_cache_size = 256M
#join_buffer_size = 262144

I would recommend vim as the editor of choice while editing my.cnf, as vim completely rox 😉 If you're not acquainted with vim, use nano or mcedit or your editor of choice 😉 :

debian:~# vim /etc/mysql/my.cnf
...

Assuming that the changes are made, it’s time to restart MySQL to make sure the new values are read by the SQL server.

debian:~# /etc/init.d/mysql restart
* Stopping MySQL database server mysqld [ OK ]
* Starting MySQL database server mysqld [ OK ]
Checking for tables which need an upgrade, are corrupt or were not closed cleanly.

If the mysql server fails to restart, however, make sure you immediately revert the changed variables back to the commented values and restart once again via the mysql init script to make the server load.

Afterwards start adding the new values one by one until you find out which one is causing mysqld to fail.

Now the second script (Tuning-primer.sh) is also really nice when MySQL performance optimizations are necessary. However, it's less portable (as it's written in the bash scripting language).
Consider that running this script on different GNU/Linux distributions (especially the newer ones) might produce errors.
Tuning-primer.sh requires some minor code changes to be able to run on FreeBSD, NetBSD and OpenBSD *nices.

The way Tuning-primer.sh works is precisely like mysqltuner.pl: one runs it, it gives some info about the currently running MySQL server and, based on certain factors, gives suggestions on how increasing or decreasing certain my.cnf variables could reduce SQL query bottlenecks, solve table locking issues as well as generally improve INSERT and UPDATE query times.

Here is an example output from tuning-primer.sh run on another server:

server:~# wget https://www.pc-freak.net/files/Tuning-primer.sh
...
server:~# sh Tuning-primer.sh
-- MYSQL PERFORMANCE TUNING PRIMER --
- By: Matthew Montgomery -

MySQL Version 5.0.51a-24+lenny5 x86_64

Uptime = 8 days 10 hrs 19 min 8 sec
Avg. qps = 179
Total Questions = 130851322
Threads Connected = 1

Server has been running for over 48hrs.
It should be safe to follow these recommendations

To find out more information on how each of these
runtime variables effects performance visit:
http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html

SLOW QUERIES
Current long_query_time = 1 sec.
You have 16498 out of 130851322 that take longer than 1 sec. to complete
The slow query log is NOT enabled.
Your long_query_time seems to be fine

MAX CONNECTIONS
Current max_connections = 2000
Current threads_connected = 1
Historic max_used_connections = 85
The number of used connections is 4% of the configured maximum.
Your max_connections variable seems to be fine.

WORKER THREADS
Current thread_cache_size = 128
Current threads_cached = 84
Current threads_per_sec = 0
Historic threads_per_sec = 0
Your thread_cache_size is fine

MEMORY USAGE
Tuning-primer.sh: line 994: let: expression expected
Max Memory Ever Allocated : 741 M
Configured Max Memory Limit : 5049 M
Total System Memory : 23640 M

KEY BUFFER
Current MyISAM index space = 1646 M
Current key_buffer_size = 476 M
Key cache miss rate is 1 / 56
Key buffer fill ratio = 90.00 %
You could increase key_buffer_size
It is safe to raise this up to 1/4 of total system memory;
assuming this is a dedicated database server.

QUERY CACHE
Query cache is enabled
Current query_cache_size = 64 M
Current query_cache_used = 38 M
Current Query cache fill ratio = 59.90 %

SORT OPERATIONS
Current sort_buffer_size = 2 M
Current record/read_rnd_buffer_size = 256.00 K
Sort buffer seems to be fine

JOINS
Current join_buffer_size = 128.00 K
You have had 111560 queries where a join could not use an index properly
You have had 91 joins without keys that check for key usage after each row
You should enable "log-queries-not-using-indexes"
Then look for non indexed joins in the slow query log.
If you are unable to optimize your queries you may want to increase your
join_buffer_size to accommodate larger joins in one pass.

TABLE CACHE
Current table_cache value = 3600 tables
You have a total of 798 tables
You have 1904 open tables.
The table_cache value seems to be fine

TEMP TABLES
Current tmp_table_size = 128 M
1% of tmp tables created were disk based
Created disk tmp tables ratio seems fine

TABLE SCANS
Current read_buffer_size = 128.00 K
Current table scan ratio = 797 : 1
read_buffer_size seems to be fine

TABLE LOCKING
Current Lock Wait ratio = 1 : 1782
You may benefit from selective use of InnoDB.

As seen from the script output, there are certain variables which might be increased a bit for better SQL performance; one such variable, as suggested, is key_buffer_size ("You could increase key_buffer_size").

Now the steps to make the tunings in my.cnf are precisely the same as with mysqltuner.pl, e.g.:
1. Preserve the old config variables which will be changed by commenting them out.
2. Double the values of the variables in my.cnf suggested by the script.
3. Restart the MySQL server via the /etc/init.d/mysql restart cmd.
4. If mysql runs fine, monitor MySQL performance with mtop or mytop for at least 15 mins / half an hour.

If all is fine, run the tuning scripts once again to see if there are further improvement suggestions; if there are, follow the 4-step procedure described above once again.

It's also a good idea to periodically re-run these scripts on the server, like once every few months, as changes in the amount and types of SQL queries will require changes in the MySQL operational variables.
The authors of these nice scripts have done a great job and have saved us tons of nerves, time, downtime and money spent on meaningless hardware. So big thanks for the awesome scripts, guys 😉
Finally, after a hopefully successful deployment of the changes, enjoy the increased SQL server performance 😉

Boost local network performance (Increase network throughput) by enabling Jumbo Frames on GNU / Linux

Saturday, March 10th, 2012

Jumbo Frames boost local network performance in GNU / Linux

So what are Jumbo Frames? And why, when and how can they increase the network throughput on Linux?

Jumbo Frames are Ethernet frames with more than 1500 bytes of payload; they can carry up to 9000 bytes of payload. Many Gigabit switches and network cards support them.
Jumbo frames are a networking standard on many educational networks like AARNET. Unfortunately, most commercial ISPs don't support them, and therefore enabling Jumbo frames will rarely increase bandwidth throughput for transfers over the internet.
Hopefully in the years to come, with the constant increase of bandwidth and betterment of connectivity, jumbo frame packet transfers will be supported by most ISPs as well.
Jumbo frames support is just great for small local / home networks and company / corporation office intranets.

Thus enabling Jumbo Frames is absolutely essential for "local" ethernet networks where large file transfers occur frequently. Such networks are ones where there is often high quality (e.g. HD) Video or Audio streaming from servers running File Sharing services like Samba, local FTP sites, Webservers etc.

One other advantage of enabling jumbo frames is the reduction of general server overhead and the decrease in CPU load (CPU usage) when transferring large or enormous files. Therefore having jumbo frames enabled on office network routers with GNU / Linux or any other *nix OS is vital.

Jumbo Frames traffic has been supported in the GNU / Linux kernel since version 2.6.17+; in the earlier 2.4.x it was possible through external third party kernel patches.

1. Manually increase MTU to 9000 with ifconfig to enable Jumbo frames

debian:~# /sbin/ifconfig eth0 mtu 9000

The default MTU on most GNU / Linux distributions (if not all) is 1500; to check the currently set MTU with ifconfig:

linux:~# /sbin/ifconfig eth0|grep -i mtu
UP BROADCAST MULTICAST MTU:1500 Metric:1

To take advantage of Jumbo Frames, all that has to be done is increase the default Maximum Transmission Unit from 1500 to 9000

For those who don't know, the MTU is the largest physical packet size that can be transferred over the network, measured by default in bytes. If information has to be transferred over the network which exceeds, let's say, an MTU of 1500 (bytes), it will be chopped up and transferred in several packets, each of size 1500.

MTUs differ on different network topologies. Just for info, here are the main MTUs for the main network types existing today:
 

  • 16 MBit/Sec Token Ring – default MTU (17914)
  • 4 Mbits/Sec Token Ring – default MTU (4464)
  • FDDI – default MTU (4352)
  • Ethernet – def MTU (1500)
  • IEEE 802.3/802.2 standard – def MTU (1492)
  • X.25 (dial up etc.) – def MTU (576)
  • Jumbo Frames – def max MTU (9000)

Setting the MTU to 9000 to enable Jumbo Frames is done with:

linux:~# /sbin/ifconfig eth0 mtu 9000

If the command returns nothing, this most likely means the server can now communicate on eth0 with MTUs of 9000 and therefore the network throughput will be better. Otherwise, if the network card or its driver is not a gigabit one, the cmd will return an error:

SIOCSIFMTU: Invalid argument

2. Enabling Jumbo Frames on Debian / Ubuntu etc. "the Debian way"

a.) Jumbo Frames on ethernet interfaces with a static IP address assigned

Edit /etc/network/interfaces; for each of the interfaces on which you would like to set Jumbo Frames you should have records similar to the ones below.

Raising the MTU to 9000 just for one time can again be done manually with ifconfig:

debian:~# /sbin/ifconfig eth0 mtu 9000

iface eth0 inet static
address 192.168.0.5
network 192.168.0.0
gateway 192.168.0.254
netmask 255.255.255.0
mtu 9000

For each of the interfaces (eth1, eth2 etc.), add a chunk similar to the one above, changing the IPs, Gateway and Netmask.

If the server is with two gigabit cards (eth0, eth1) supporting Jumbo frames add to /etc/network/interfaces :

iface eth0 inet static
address 192.168.0.5
network 192.168.0.0
gateway 192.168.0.254
netmask 255.255.255.0
mtu 9000

iface eth1 inet static
address 192.168.0.6
network 192.168.0.0
gateway 192.168.0.254
netmask 255.255.255.0
mtu 9000

b.) Jumbo Frames on ethernet interfaces with dynamic IP obtained via DHCP

Again in /etc/network/interfaces put:

auto eth0
iface eth0 inet dhcp
post-up /sbin/ifconfig eth0 mtu 9000

3. Setting Jumbo Frames on Fedora / CentOS / RHEL "the Redhat way"

Enabling jumbo frames on all Gigabit lan interfaces (eth0, eth1, eth2 …) in Fedora / CentOS / RHEL is done through files:
 

  • /etc/sysconfig/network-scripts/ifcfg-eth0
  • /etc/sysconfig/network-scripts/ifcfg-eth1

etc. …
append in each one at the end of the respective config:

MTU=9000

[root@fedora ~]# echo 'MTU=9000' >> /etc/sysconfig/network-scripts/ifcfg-eth0


A quick way to set the Maximum Transmission Unit to 9000 for all network interfaces on Redhat based distros is to execute the following loop:

[root@centos ~]# for i in /etc/sysconfig/network-scripts/ifcfg-eth*; do echo 'MTU=9000' >> $i; done

P.S.: Be sure that all your interfaces support MTU=9000, otherwise applying the MTU setting will return the SIOCSIFMTU: Invalid argument error.
The above loop is to be used only in case you have a group of identical machines with LAN cards supporting Gigabit networks and loaded kernel drivers supporting an MTU of up to 9000.

Some Intel and Realtek Gigabit cards support only a maximum MTU of 7000, 7500 etc., so if you own a card like this, check what max MTU the card supports and set that in the LAN device configuration.
If the MTU increase is done on a remote server over an SSH connection, be extremely cautious, as restarting the network might leave your server inaccessible.

To check if each of the server interfaces are "Gigabit ready":

[root@centos ~]# /sbin/ethtool eth0|grep -i 1000BaseT
1000baseT/Half 1000baseT/Full
1000baseT/Half 1000baseT/Full

If you're 100% sure there will be no troubles with enabling MTU > 1500, initiate a network reload:

[root@centos ~]# /etc/init.d/network restart
...

4. Enable Jumbo Frames on Slackware Linux

To list the ethernet devices and check they are Gigabit ones issue:

bash-4.1# lspci | grep [Ee]ther
0c:00.0 Ethernet controller: D-Link System Inc Gigabit Ethernet Adapter (rev 11)
0c:01.0 Ethernet controller: D-Link System Inc Gigabit Ethernet Adapter (rev 11)

Setting up jumbo frames on Slackware Linux can be done in two ways: the Slackware way and the "universal" Linux way:

a.) the Slackware way

On Slackware Linux, all kind of network configurations are done in /etc/rc.d/rc.inet1.conf

Usual config for eth0 and eth1 interfaces looks like so:

# Config information for eth0:
IPADDR[0]="10.10.0.1"
NETMASK[0]="255.255.255.0"
USE_DHCP[0]=""
DHCP_HOSTNAME[0]=""
# Config information for eth1:
IPADDR[1]="10.1.1.1"
NETMASK[1]="255.255.255.0"
USE_DHCP[1]=""
DHCP_HOSTNAME[1]=""

To raise the MTU to 9000, the variables MTU[0]="9000" and MTU[1]="9000" have to be included after each interface config block, e.g.:

# Config information for eth0:
IPADDR[0]="172.16.1.1"
NETMASK[0]="255.255.255.0"
USE_DHCP[0]=""
DHCP_HOSTNAME[0]=""
MTU[0]="9000"
# Config information for eth1:
IPADDR[1]="10.1.1.1"
NETMASK[1]="255.255.255.0"
USE_DHCP[1]=""
DHCP_HOSTNAME[1]=""
MTU[1]="9000"

bash-4.1# /etc/rc.d/rc.inet1 restart
...

b.) The "Universal" Linux way

This way is working on most if not all Linux distributions.
Insert in /etc/rc.local:

/sbin/ifconfig eth0 mtu 9000 up
/sbin/ifconfig eth1 mtu 9000 up
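On distributions shipping the iproute2 tools, the same one-time change can also be done with the ip command instead of ifconfig:

/sbin/ip link set dev eth0 mtu 9000
/sbin/ip link set dev eth1 mtu 9000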

5. Check if Jumbo Frames are properly enabled

There are at least two ways to display the MTU settings for eths.

a.) Grepping the MTU from ifconfig

linux:~# /sbin/ifconfig eth0|grep -i mtu
UP BROADCAST MULTICAST MTU:9000 Metric:1
linux:~# /sbin/ifconfig eth1|grep -i mtu
UP BROADCAST MULTICAST MTU:9000 Metric:1

b.) Using the ip command from the iproute2 package to get the MTU

linux:~# ip route get 192.168.2.134
local 192.168.2.134 dev lo src 192.168.2.134
cache mtu 9000 advmss 1460 hoplimit 64

linux:~# ip route show dev wlan0
192.168.2.0/24 proto kernel scope link src 192.168.2.134
default via 192.168.2.1

You see the MTU is now set to 9000, so the two server LANs are now able to communicate with increased network throughput.
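A further quick sanity test that jumbo frames really pass end to end between two hosts is to send a non-fragmentable ICMP packet just below the 9000 byte limit (9000 bytes minus 20 bytes IP header minus 8 bytes ICMP header = 8972 bytes of payload); if a switch in between or the remote card does not do jumbo frames, the ping will fail (192.168.0.6 here is just the second host from the example configs above):

linux:~# ping -M do -s 8972 192.168.0.6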
Enjoy the accelerated network transfers 😉

 

How to fix “imapd-ssl: Maximum connection limit reached for ::ffff:xxx.xxx.xxx.xxx” imapd-ssl error

Saturday, May 28th, 2011

One of the mail server clients was running into issues with secured SSL IMAP connections (he has to use multiple email accounts on the same computer).
I was informed that part of the email addresses were working correctly, however the newly created ones were failing to authenticate, even though all the Outlook Express email configuration was correct and the username and password typed in were real, existing credentials on the vpopmail server.

Initially I thought something was wrong with his newly configured emails, but it seems all the settings were perfectly correct!

After a lot of wondering what might be wrong, I realized I had been dumb enough not to check my imap log files.

After checking my /var/log/mail.log, which is the default log file I've configured for vpopmail and some of my qmail server services, I found the following error repeating again and again:

imapd-ssl: Maximum connection limit reached for ::ffff:xxx.xxx.xxx.xxx

where xxx.xxx.xxx.xxx was the email user's computer IP address.
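If you hit something similar, a quick way to see how often and for which client IPs the limit is being reached is to grep the mail log (assuming your courier-imap also logs to /var/log/mail.log as mine does):

debian:~# grep -i 'Maximum connection limit reached' /var/log/mail.log | awk '{ print $NF }' | sort | uniq -c | sort -rn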

This issue was caused by one of my configuration settings in the imapd-ssl and imapd config file:

/usr/lib/courier-imap/etc/imapd

In /usr/lib/courier-imap/etc/imapd there is a config segment called
Maximum number of connections to accept from the same IP address

Right below this commented text is the variable:

MAXPERIP=4

As you can see, it seems I had used a very low value for the maximum number of connections from one and the same IP address.
I suppose my logic in setting such a low value was the desire to protect the IMAP server from Denial of Service attacks; however 4 is really too low and causes problems, so to solve the mail connection issues for the user I raised the MAXPERIP value to 50:

MAXPERIP=50

Now, to force the imapd and imapd-ssl services to reload their config, I did a restart of courier-imap, like so:

debian:~# /etc/init.d/courier-imap restart

That's all; now the error is gone and the client can easily configure up to 50 mailbox accounts on his PC 🙂

Feast of Saint Stephen’s martyrdom in Bulgarian Orthodox Church / Saint Stephen the first Christian Saint

Tuesday, December 27th, 2011

Saint Stephen Martyrdom Orthodox Christian Icon

It is the 27th of December, the 3rd day of Christmas, and on this day in the Bulgarian Orthodox Church we commemorate Saint Stephen's Martyrdom.

It is a well known fact that by his martyrdom Saint Stephen became the first Christian martyr.
Stephen's name etymology comes from the Greek (Stephanos) and translated means "Crown".

What we know from the Orthodox Church's tradition is that Stephen was very young at the time of his Martyrdom, probably in his 20s.

Stephen's glorious martyrdom inspired and significantly strengthened the early, severely persecuted Church, and gave a lot of courage and faith to many of our early Saint Martyrs. These early Church saints now incessantly pray to God to have mercy on our earthly Church.
It is not a widely known fact that St. Stephen held a rank in the early Church clergy: he was a Hierodeacon.
Hierodeaconship in our church is considered the maximum rank of deaconship one can hold just before he is ordained a Priest.

St. Stephen Orthodox Christian icon

St. Stephen was burning with love for Christ and had a strong faith in God, turning many Jews to our Lord and Saviour Jesus Christ. He showed the true way to many by teaching them how to properly interpret the Old Testament (ancient Jewish) writings.
He explained to many Jews how all the old scriptures testify about the coming of the Messiah (Christ), who will save all who believe in his name from sin and hell's damnation.
As the Jewish leaders looked for ways to shut down any kind of preaching of the Gospel and kill Christianity, they saw Stephen as a big enemy, just like all Christians in those early days. Therefore they looked for ways to accuse him and execute him, and so put an end to the large number of Jewish people converting from Judaism to Christianity.
The whole story of the stoning (execution) of St. Stephen is described in the Holy Scriptures in the book called the Acts of the Apostles.

As he was about to die, Stephen looked up to heaven and said:

"Behold, I see the heavens opened, and the Son of man standing at the right hand of God."
Then, as he was being stoned, he cried out, "Lord Jesus, receive my spirit." Our beloved saint's last words were, "Lord, do not hold this sin against them.", manifesting his saintship and deep love for Christ even during his last moments of earthly living.
St. Stephen is also celebrated in the Western Roman Catholic Church and among Anglicans.
Churches commemorating the great saint are located all around the world.
St. Stephen is also considered a patron saint of Republika Srpska (Bosnia and Herzegovina).
Stephen's name later probably became the origin of the West's so popular name Steven.

Saint Stephen's Byzantine Icon from the 11th Century

May God, by the Holy Prayers of St. Stephen, have mercy on the Church and increase our faith and His mercies to us sinners.