Posts Tagged ‘Filesystem Size Used Avail Use Mounted’

No space left on device with free disk space / Why no space left on device while there is plenty of disk space on drive – Running out of Inodes

Tuesday, November 17th, 2015

no_space_left-on-device-while-there-is-disk-space-running-out-of-file-inodes-unix_linux_file_system_diagram.gif

 

On one of the servers, I'm administrating the websites started showing some Mysql database table corrup errors like:
 

 

Table './database_name/site_news_list_com' is marked as crashed and last (automatic?) repair failed

The server is using Oracle MySQL server community stable edition on Debian GNU / Linux 6.0, so I first thought during work the server crashed either due to some bug issue in MySQL or it crashed due to some PHP cron job that did something messy. Thus to solve the crashed tables, tried using mysqlcheck tool which helped pretty fine, at many times whether there were database / table corruptions. I've run the following set of mysqlcheck commands with root (superuser) in a bash shell after logging in through SSH:

:

server:~# /usr/bin/mysqlcheck –defaults-extra-file=/etc/mysql/debian.cnf \–check –all-databases -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'`>> /var/log/cronwork.log
server:~# /usr/bin/mysqlcheck –defaults-extra-file=/etc/mysql/debian.cnf –analyze –all-databases -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'`>> /var/log/cronwork.log
server:~# /usr/bin/mysqlcheck –defaults-extra-file=/etc/mysql/debian.cnf \–auto-repair –optimize –all-databases -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'`>> /var/log/cronwork.log
server:~# /usr/bin/mysqlcheck –defaults-extra-file=/etc/mysql/debian.cnf \–optimize –all-databases -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'`>> /var/log/cronwork.log


In order for above commands to work, I've created the /root/.my.cnf containing my root (mysql CLI) mysql username and password, e.g. file has content like below:

 

[client]
user=root
password=MySecretPassword8821238

 

Btw a good note here is its generally a good idea (if you want to have consistent mysql databases) to automatically execute via a cron job 2 times a month, I've in root cronjob the following:

 

crontab -u root -l |grep -i mysqlcheck
04 06 5,10,15,20,25,1 * * /usr/bin/mysqlcheck –defaults-extra-file=/etc/mysql/debian.cnf \–check –all-databases –silent -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'`>> /var/log/cronwork.log 07 06 5,10,15,20,25,1 * * /usr/bin/mysqlcheck –defaults-extra-file=/etc/mysql/debian.cnf –analyze –all-databases –silent -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'`>> /var/log/cronwork.log 12 06 5,10,15,20,25,1 * * /usr/bin/mysqlcheck –defaults-extra-file=/etc/mysql/debian.cnf \–auto-repair –optimize –all-databases –silent -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'`>> /var/log/cronwork.log 17 06 5,10,15,20,25,1 * * /usr/bin/mysqlcheck –defaults-extra-file=/etc/mysql/debian.cnf \–optimize –all-databases –silent -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'`>> /var/log/cronwork.log


Strangely I got a lot of errors that some .MYI / .MYD .frm temp files, necessery for the mysql tables recovery can't be written inside /home/mysql/database_name

That was pretty weird and I thought there might be some issues with permissions, causing the inability to write, due to some bug or something so I went straight and checked /home/mysql/database_name permissions, e.g.::

 

server:/home/mysql/database_name# ls -ld soccerfame
drwx—— 2 mysql mysql 36864 Nov 17 12:00 soccerfame
server:/home/mysql/database_name# ls -al1|head -n 10
total 1979012
drwx—— 2 mysql mysql 36864 Nov 17 12:00 .
drwx—— 36 mysql mysql 4096 Nov 17 11:12 ..
-rw-rw—- 1 mysql mysql 8712 Nov 17 10:26 1_campaigns_diez.frm
-rw-rw—- 1 mysql mysql 14672 Jul 8 18:57 1_campaigns_diez.MYD
-rw-rw—- 1 mysql mysql 1024 Nov 17 11:38 1_campaigns_diez.MYI
-rw-rw—- 1 mysql mysql 8938 Nov 17 10:26 1_campaigns.frm
-rw-rw—- 1 mysql mysql 8738 Nov 17 10:26 1_campaigns_logs.frm
-rw-rw—- 1 mysql mysql 883404 Nov 16 22:01 1_campaigns_logs.MYD
-rw-rw—- 1 mysql mysql 330752 Nov 17 11:38 1_campaigns_logs.MYI


As seen from above output, all was perfect with permissions, so it should have been something else, so I decided to try to create a random file with touch command inside /home/mysql/database_name directory:

 

touch /home/mysql/database_name/somefile-to-test-writtability.txt touch: cannot touch ‘/scr1/data/somefile-to-test-writtability.txt‘: No space left on device


Then logically I thought the /home/mysql/ mounted ext4 partition got filled, because of crashed SQL database or a bug thus, checked with disk free command df whether there is enough space on server:

server:~# df -h
Filesystem Size Used Avail Use% Mounted on
/dev/md1 20G 7.6G 11G 42% /
udev 10M 0 10M 0% /dev
tmpfs 13G 1.3G 12G 10% /run
tmpfs 32G 0 32G 0% /dev/shm
tmpfs 5.0M 0 5.0M 0% /run/lock
tmpfs 32G 0 32G 0% /sys/fs/cgroup
/dev/md2 256G 134G 110G 55% /home

Well that's weird? Obviously only 55% of available disk space is used and available 134G which was more than enough so I got totally puzzled why, files can't be written.

Then very logically, I thought it might be that /home directory has remounted as read only, because the SSD memory disk on server is failing and checked for errors in dmesg, i.e.:

 

server:~# dmesg|grep -i error


Also checked how exactly was partition mounted, to check whether it is (RO) read-only:

 

server:~# mount -l|grep -i /home
/dev/md2 on /home type ext4 (rw,relatime,discard,data=ordered)


Now everything become even more weirder, as obviously the disk continued to be claiming no space left on device, while in reality there was plenty of disk space.

Then after running a quick research on the internet for the no space left on device with free disk space, I've come across this great superuser.com thread which let me realize the partition run out of inodes and that's why no new file inodes could be assigned and therefore, the linux kernel is refusing to write the file on ext4 partition.

For those who haven't heard of Linux Partition Inodes here is link to Wikipedia and a quick quote:

 

In a Unix-style file system, the inode is a data structure used to represent a filesystem object, which can be one of various things including a file or a directory. Each inode stores the attributes and disk block location(s) of the filesystem object's data.[1] Filesystem object attributes may include manipulation metadata (e.g. change,[2] access, modify time), as well as owner and permission data (e.g. group-id, user-id, permissions).[3]
Directories are lists of names assigned to inodes. The directory contains an entry for itself, its parent, and each of its children.


Once I understood it is the inodes, I checked how many of them are occupied with cmd:

 

server:~# df -i /home
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/md2 17006592 17006592 0 100% /home


You see, there were 0 (zero) free file inodes on server and that was the reason for no space left on device while there was actually free disk space

To clean up (free) some inodes on partition, first thing I did is to delete all old logs which were inside /home and files I positively know not to be necessery, then to find which directories allocating most innodes used:

 

server:~# find . -xdev -type f | cut -d "/" -f 2 | sort | uniq -c | sort -n


If you're on a regular old fashined IDE Hard Drive and not SSD or you have too much files inside this command will take really long …:

Therefore a better solution might be to frist:

a) Try to find root folders with large inodes count:

for i in /home/*; do echo $i; find $i |wc -l; done
Try to find specific folders:


You should get output like:

 

/home/new_website
606692
/home/common
73
/home/pcfreak
5661
/home/hipo
33
/home/blog
13570
/home/log
123
/home/lost+found
1

b) Then once you know the directory allocating most inodes, run the command again to see the sub-directories with most files (eating) partition innodes:

 

for i in /home/webservice/*; do echo $i; find $i |wc -l; done

 

One usual large folder which could free you some nodes is the linux source headers, but in my case it was simply a lot of tiny old logs being logged on the system for few years in the past without cleaning:

After deleting the log dirs and cache folder in my case /home/new_website/{log,cache}:

server:~# rm -rf /home/new_website/log/*
server:~# rm -rf /home/new_website/cache/*

 

 

a) Then, stopping Apache webserver to check prevent Apache to use MySQl databases while running database repair and restaring MySQL:
 

server:~# /etc/init.d/apache2 stop Restarting MySQL server
..
server:~# /etc/init.d/mysql restart
..


b) And re-issuing MySQL Check / Repair / Optimize database commands:
 

 

mysqlcheck –defaults-extra-file=/etc/mysql/debian.cnf \–check –all-databases -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'`>> /var/log/cronwork.log

mysqlcheck –defaults-extra-file=/etc/mysql/debian.cnf –analyze –all-databases -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'`>> /var/log/cronwork.log

mysqlcheck –defaults-extra-file=/etc/mysql/debian.cnf \–auto-repair –optimize –all-databases -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'`>> /var/log/cronwork.log

mysqlcheck –defaults-extra-file=/etc/mysql/debian.cnf \–optimize –all-databases -u root -p`grep -i password /root/.my.cnf |sed -e 's#password=##g'`>> /var/log/cronwork.log

c) And finally starting the Apache Webserver again:
 

server:~# /etc/init.d/apache2 start


Some innodse got freed up:
 

server:~# df -i /home Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/md2 17006592 16797196 209396 99% /home


And hooray by God's Grace and with help of prayers of The most Holy Theotokos (Virgin) Mary, websites started again !

Apache Webserver: No space left on device: Couldn’t create accept lock /var/lock/apache2/accept.lock – Fix

Wednesday, April 8th, 2015

Apache-http-server-no-space-left-on-device-semaphores-quotes-hard-disk-space-resolve-fix-howto
If out of a sudden your Apache webserver crashes and is refusing to start up by manually trying to restart it through its init script on Debian Linux servers – /etc/init.d/apache2 and RPM based ones: /etc/init.d/httpd

Checking in php_error.log there was no shown errors related to loading PHP modules, however apache's error.log show following errors:

[Wed Apr 08 14:20:14 2015] [error] [client 180.76.5.149] client denied by server configuration: /var/www/sploits/info/trojans_info/tr_data/y3190.html
[Wed Apr 08 14:20:39 2015] [warn] pid file /var/run/apache2.pid overwritten — Unclean shutdown of previous Apache run?
[Wed Apr 08 14:20:39 2015] [emerg] (28)No space left on device: Couldn't create accept lock (/var/lock/apache2/accept.lock.15974) (5)
[Wed Apr 08 14:25:39 2015] [warn] pid file /var/run/apache2.pid overwritten — Unclean shutdown of previous Apache run?
[Wed Apr 08 14:25:39 2015] [emerg] (28)No space left on device: Couldn't create accept lock (/var/lock/apache2/accept.lock.16790) (5)
[Wed Apr 08 14:27:03 2015] [warn] pid file /var/run/apache2.pid overwritten — Unclean shutdown of previous Apache run?
[Wed Apr 08 14:27:03 2015] [emerg] (28)No space left on device: Couldn't create accept lock (/var/lock/apache2/accept.lock.16826) (5)
[Wed Apr 08 14:27:53 2015] [warn] pid file /var/run/apache2.pid overwritten — Unclean shutdown of previous Apache run?
[Wed Apr 08 14:27:53 2015] [emerg] (28)No space left on device: Couldn't create accept lock (/var/lock/apache2/accept.lock.16852) (5)
[Wed Apr 08 14:30:48 2015] [warn] pid file /var/run/apache2.pid overwritten — Unclean shutdown of previous Apache run?
[Wed Apr 08 14:30:48 2015] [emerg] (28)No space left on device: Couldn't create accept lock (/var/lock/apache2/accept.lock.17710) (5)
[Wed Apr 08 14:31:21 2015] [warn] pid file /var/run/apache2.pid overwritten — Unclean shutdown of previous Apache run?
[Wed Apr 08 14:31:21 2015] [emerg] (28)No space left on device: Couldn't create accept lock (/var/lock/apache2/accept.lock.17727) (5)
[Wed Apr 08 14:32:40 2015] [warn] pid file /var/run/apache2.pid overwritten — Unclean shutdown of previous Apache run?
[Wed Apr 08 14:32:40 2015] [emerg] (28)No space left on device: Couldn't create accept lock (/var/lock/apache2/accept.lock.17780) (5)
[Wed Apr 08 14:38:32 2015] [warn] pid file /var/run/apache2.pid overwritten — Unclean shutdown of previous Apache run?

As you can read the most likely reason behind above errors preventing for apache to start is /var/run/apache2.pid  unable to be properly written due to lack of disk space or due to disk quota set for users including for userID with which Apache is running.

First thing I did is of course to see how much free space is on the server:

df -h
Filesystem                     Size  Used Avail Use% Mounted on
/dev/mapper/vg00-rootvol       4.0G  1.7G  2.2G  44% /
udev                           7.8G  204K  7.8G   1% /dev
tmpfs                           24G     0   24G   0% /dev/shm
/dev/sda1                      486M   40M  422M   9% /boot
/dev/mapper/vg00-lv_crashdump 1008M   34M  924M   4% /crashdump
/dev/mapper/vg00-homevol       496M   26M  445M   6% /home
/dev/mapper/vg00-lv_opt         12G  1.4G  9.9G  13% /opt
/dev/mapper/vg00-tmpvol        2.0G   68M  1.9G   4% /tmp
/dev/mapper/vg00-varvol        7.9G  609M  6.9G   8% /var
/dev/mapper/vg00-crashvol      1.9G   35M  1.8G   2% /var/crash
/dev/mapper/vg00-auditvol      124M  5.6M  113M   5% /var/log/audit
/dev/mapper/vg00-webdienste     60G   12G   48G  19% /webservice

 

As visible from above df command output , there is enough disk on HDD, so this is definitely not the issue:

Then I Checked whether there is Quota enabled on the Linux server with repquota command shows there are no quotas enabled:

# repquota / var/
repquota: Mountpoint (or device) / not found or has no quota enabled.
repquota: Mountpoint (or device) /var not found or has no quota enabled.
repquota: Not all specified mountpoints are using quota.

 

So obviously the only few left possible reason for Apache failing to start after invoked via init script is  either due to left tainted semaphores or due to some server hardware  RAM problem / or a dying  hard disk with bad blocks.

So what are Semaphores? Generally speaking Semaphores are apparatus for conveying information by means of visual signals between applications (something like sockets).They're used for communicating between the active processes of a certain application. In the case of Apache, they’re used to communicate between the parent and child processes, hence if Apache can’t properly write and coordinate these things down, then it can’t communicate properly with all of the processes it starts and hence the Main HTTPD process can't spawn probably its childs preventing Webserver to enter "started mode" and write its PID file.

To check general information about system semaphore arrays there is the ipcs -s command, however my experience is that ipcs -a is more useful (because it lists generally all kind of semaphores) including Semaphore Shared Memory Signals which are the most likely to cause you the problem.

ipcs -a

—— Shared Memory Segments ——–
key        shmid      owner      perms      bytes      nattch     status

—— Semaphore Arrays ——–
key        semid      owner      perms      nsems
0x00000000 22970368   www-data   600        1

—— Message Queues ——–
key        msqid      owner      perms      used-bytes   messages

As you see in my case there is a Semaphore Arrays which had to be cleaned to make Apache2 be able to start again.
 

To clean all left semaphores (arrays) preventing Apache from start properly, use below for one liner bash loop:
 

for i in `ipcs -s | awk '/www-data/ {print $2}'`; do (ipcrm -s $i); done
ipcrm -m 0x63637069


Note that above for loop is specific to Debian on CentOS / Fedora / RHEL and other Linuxes the username with which stucked semaphores might stay will be apache or httpd

Depending on the user with which the Apache Webserver is running, run above loop like so:

For RPM based distros (CentOS / RHEL):

 

for i in `ipcs -s | awk '/apache/ {print $2}'`; do (ipcrm -s $i); done
ipcrm -m 0x63637069


For other distros such as Slackware or FreeBSD or any custom compiled Apache webserver:

for i in `ipcs -s | awk '/httpd/ {print $2}'`; do (ipcrm -s $i); done
ipcrm -m 0x63637069


If there is also Shared Memory Segments you can remove them with ipcrm i.e.:

ipcrm -m 0x63637069


An alternative way to get rid of left uncleaned semaphores is with xargs:
 

ipcs -s | grep nobody | awk ‘ { print $2 } ‘ | xargs ipcrm


Even though this fixes the issue I understood my problems were due to exceeding semaphores, to check default number of set semaphores on Linux Kernel level as well as few Semaphore related values run below sysctl:

sysctl -a | egrep kernel.sem\|kernel.msgmni
kernel.msgmni = 15904

kernel.sem = 250        32000   32      128


As you can see the number of maximum semaphores is quite large so in my case the failure because of left semaphores was most likely due to some kind of Cracker / Automated bot scanner attack or someone trying malicious against the webserver or simply because of some kind of Apache bug or enormous high load the server faced.