Posts Tagged ‘emergency’

How to solve “Incorrect key file for table ‘/tmp/#sql_9315.MYI’; try to repair it” mysql start up error

Saturday, April 28th, 2012

When a server hard disk scape gets filled its common that Apache returns empty (no content) pages…
This just happened in one server I administer. To restore the normal server operation I freed some space by deleting old obsolete backups.
Actually the whole reasons for this mess was an enormous backup files, which on the last monthly backup overfilled the disk empty space.

Though, I freed about 400GB of space on the the root filesystem and on a first glimpse the system had plenty of free hard drive space, still restarting the MySQL server refused to start up properly and spit error:

Incorrect key file for table '/tmp/#sql_9315.MYI'; try to repair it" mysql start up error

Besides that there have been corrupted (crashed) tables, which reported next to above error.
Checking in /tmp/#sql_9315.MYI, I couldn't see any MYI – (MyISAM) format file. A quick google look up revealed that this error is caused by not enough disk space. This was puzzling as I can see both /var and / partitions had plenty of space so this shouldn't be a problem. Also manally creating the file /tmp/#sql_9315.MYI with:

server:~# touch /tmp/#sql_9315.MYI

Didn't help it, though the file created fine. Anyways a bit of a closer examination I've noticed a /tmp filesystem mounted besides with the other file system mounts ????
You can guess my great amazement to find this 1 Megabyte only /tmp filesystem hanging on the server mounted on the server.

I didn't mounted this 1 Megabyte filesystem, so it was either an intruder or some kind of "weird" bug…
I digged in Googling to see, if I can find more on the error and found actually the whole mess with this 1 mb mounted /tmp partition is caused by, just recently introduced Debian init script /etc/init.d/mountoverflowtmp.
It seems this script was introduced in Debian newer releases. mountoverflowtmp is some kind of emergency script, which is triggered in case if the root filesystem/ space gets filled.
The script has only two options:

# /etc/init.d/mountoverflowtmp
Usage: mountoverflowtmp [start|stop]

Once started what it does it remounts the /tmp to be 1 megabyte in size and stops its execution like it never run. Well maybe, the developers had something in mind with introducing this script I will not argue. What I should complain though is the script design is completely broken. Once the script gets "activated" and does its job. This 1MB mount stays like this, even if hard disk space is freed on the root partition – / ….

Hence to cope with this unhandy situation, once I had freed disk space on the root partition for some reason mountoverflowtmp stop option was not working,
So I had to initiate "hard" unmount:

server:~# mount -l /tmp

Also as I had a bunch of crashed tables and to fix them, also issued on each of the broken tables reported on /etc/init.d/mysql start start-up.

server:~# mysql -u root -p
mysql> use Database_Name;
mysql> repair table Table_Name extended;
....

Then to finally solve the stupid Incorrect key file for table '/tmp/#sql_XXYYZZ33444.MYI'; try to repair it error, I had to restart once again the SQL server:

Stopping MySQL database server: mysqld.
Starting MySQL database server: mysqld.
Checking for corrupt, not cleanly closed and upgrade needing tables..
root@server:/etc/init.d#

Tadadadadam!, SQL now loads and works back as before!

How to suspend Cpanel user through ssh / Cpanel shell command to suspend users

Friday, August 5th, 2011

One of the servers running Cpanel has been suspended today and the Data Center decided to completely bring down our server and gave us access to it only through rescue mode running linux livecd.

Thus I had no way to access the Cpanel web interface to suspend the “hacker” who by the way was running a number of instances of this old Romanian script kiddies brute force ssh scanner called sshscan .

Thanksfully Cpanel is equipped with a number of handy scripts for emergency situations in /scripts directory. These shell management scripts are awesome for situations like this one, where no web access is not avaiable.

To suspend the abuser / (abusive user ) I had to issue the command:

root@rescue [/]# /scripts/suspendacct abuse_user
Changing Shell to /bin/false...chsh: Unknown user context is not authorized to change the shell of abuse_user
Done
Locking Password...Locking password for user abuse_user.
passwd: Success
Done
Suspending mysql users
warn [suspendmysqlusers] abuse_user has no databases.
Notification => reports@santrex.net via EMAIL [level => 3]
Account previously suspended (password was locked).
/bin/df: `/proc/sys/fs/binfmt_misc': No such file or directory
Using Universal Quota Support (quota=0)
Suspended document root /home/abuse_user/public_html
Suspended document root /home/abuse_user/public_html/updateverificationonline.com
Using Universal Quota Support (quota=0)
Updating ftp passwords for abuse_user
Ftp password files updated.
Ftp vhost passwords synced
abuse_user's account has been suspended

That’s all now the user is suspended, so hopefully the DC will bring the server online in few minutes.

How to automatically reboot (restart) Debian GNU Lenny / Squeeze Linux on kernel panic, some general CPU overload or system crash

Monday, June 21st, 2010

If you are a system administrator, you have probably wondered at least once ohw to configure your Linux server to automatically reboot itself if it crashes, is going through a mass CPU overload, e.g. the server load average “hits the sky”.
I just learned from a nice article found here that there is a kernel variable which when enabled takes care to automatically restart a crashed server with the terrible Kernel Panic message we all know.

The variable I’m taking about is kernel.panic for instance kernel.panic = 20 would instruct your GNU Linux kernel to automatically reboot if it experiences a kernel panic system crash within a time limit of 20 seconds.

To start using the auto-reboot linux capabilities on a kernel panic occurance just set the variable to /etc/sysctl.conf

debian-server:~# echo 'kernel.panic = 20' >> /etc/sysctl.conf

Now we will also have to enable the variable to start being use on the system, so execute:

debian-server:~# sysctl -p There you go automatic system reboots on kernel panics is now on.
Now to further assure yourself the linux server you’re responsible of will automatically restart itself on a emergency situation like a system overload I suggest you check Watchdog

You might consider checking out this auto reboot tutorial which explains in simple words how watchdog is installed and configured.
On Debian installing and maintaining watchdog is really simple and comes to installing and enabling the watchdog system service, right afteryou made two changes in it’s configuration file /etc/watchdog.conf

To do so execute:

debian-server:~# apt-get install watchdog
debian-server:~# echo "file = /var/log/messages" >> /etc/watchdog.conf
debian-server:~# echo "watchdog-device = /dev/watchdog" >> /etc/watchdog.conf

Well that should be it, you might also need to load some kernel module to monitor your watchdog.
On my system the kernel modules related to watchdog are located in:

/lib/modules/2.6.26-2-amd64/kernel/drivers/watchdog/
If not then you should certainly try the software watchdog linux kernel module called softdog , to do so issue:
debian-server:~# /sbin/modprobe softdog

It’s best if you load the module while the softdog daemon is disabled.
If you consider auto loadig the softdog software watchdog kernel driver you should exec:

debian-server:~# echo 'softdog' >> /etc/modules

Finally a start of the watchdog is necessery:

 


debian-server:~# /etc/init.d/watchdog start
Stopping watchdog keepalive daemon....
Starting watchdog daemon....

That should be all your automatic system reboots should be now on! 🙂

How to reboot remotely Linux server if reboot, shutdown and init commands are not working (/sbin/reboot: Input/output error) – Reboot Linux in emergency using MagicSysRQ kernel sysctl variable

Saturday, July 23rd, 2011

SysRQ an alternative way to restart unrestartable Linux server

I’ve been in a situation today, where one Linux server’s hard drive SCSI driver or the physical drive is starting to break off where in dmesg kernel log, I can see a lot of errors like:

[178071.998440] sd 0:0:0:0: [sda] Result: hostbyte=DID_BAD_TARGET driverbyte=DRIVER_OK,SUGGEST_OK
[178071.998440] end_request: I/O error, dev sda, sector 89615868

I tried a number of things to remount the hdd which was throwing out errors in read only mode, but almost all commands I typed on the server were either shown as missng or returning an error:
Input/output error

Just ot give you an idea what I mean, here is a paste from the shell:

linux-server:/# vim /etc/fstab
-bash: vim: command not found
linux-server:/# vi /etc/fstab
-bash: vi: command not found
linux-server:/# mcedit /etc/fstab
-bash: /usr/bin/mcedit: Input/output error
linux-server:/# fdisk -l
-bash: /sbin/fdisk: Input/output error

After I’ve tried all kind of things to try to diagnose the server and all seemed failing, I thought next a reboot might help as on server boot the filesystems will get checked with fsck and fsck might be able to fix (at least temporary) the mess.

I went on and tried to restart the system, and guess what? I got:

/sbin/reboot init Input/output error

I hoped that at least /sbin/shutdown or /sbin/init commands might work out and since I couldn’t use the reboot command I tried this two as well just to get once again:

linux-server:/# shutdown -r now
bash: /sbin/shutdown: Input/output error
linux-server:/# init 6
bash: /sbin/init: Input/output error

You see now the situation was not pinky, it seemed there was no way to reboot the system …
Moreover the server is located in remote Data Center and I the tech support there is conducting assigned task with the speed of a turtle.
The server had no remote reboot, web front end or anything and thefore I needed desperately a way to be able to restart the machine.

A bit of research on the issue has led me to other people who experienced the /sbin/reboot init Input/output error error mostly caused by servers with failing hard drives as well as due to HDD control driver bugs in the Linux kernel.

As I was looking for another alternative way to reboot my Linux machine in hope this would help. I came across a blog post Rebooting the Magic Wayhttp://www.linuxjournal.com/content/rebooting-magic-way

As it was suggested in Cory’s blog a nice alternative way to restart a Linux machine without using reboot, shutdown or init cmds is through a reboot with the Magic SysRQ key combination

The only condition for the Magic SysRQ key to work is to have enabled the SysRQ – CONFIG_MAGIC_SYSRQ in Kernel compile time.
As of today luckily SysRQ Magic key is compiled and enabled by default in almost all modern day Linux distributions in this numbers Debian, Fedora and their derivative distributions.

To use the sysrq kernel capabilities as a mean to restart the server, it’s necessery first to activate the sysrq through sysctl, like so:

linux-server:~# sysctl -w kernel.sysrq=1
kernel.sysrq = 1

I found enabling the kernel.sysrq = 1 permanently in the kernel is also quite a good idea, to achieve that I used:

echo 'kernel.sysrq = 1' >> /etc/sysctl.conf

Next it’s wise to use the sync command to sync any opened files on the server as well stopping as much of the server active running services (MySQL, Apache etc.).

linux-server:~# sync

Now to reboot the Linux server, I used the /proc Linux virtual filesystem by issuing:

linux-server:~# echo b > /proc/sysrq-trigger

Using the echo b > /proc/sysrq-trigger simulates a keyboard key press which does invoke the Magic SysRQ kernel capabilities and hence instructs the kernel to immediately reboot the system.
However one should be careful with using the sysrq-trigger because it’s not a complete substitute for /sbin/reboot or /sbin/shutdown -r commands.
One major difference between the standard way to reboot via /sbin/reboot is that reboot kills all the running processes on the Linux machine and attempts to unmount all filesystems, before it proceeds to sending the kernel reboot instruction.

Using echo b > /proc/sysrq-trigger, however neither tries to umount mounted filesystems nor tries to kill all processes and sync the filesystem, so on a heavy loaded (SQL data critical) server, its use might create enormous problems and lead to severe data loss!

SO BEWARE be sure you know what you’re doing before you proceed using /proc/sysrq-trigger as a way to reboot ;).