Posts Tagged ‘MYD’

How to determine WordPress blogs with most spam on multiple blog hosting server

Thursday, November 27th, 2014

determine_find_blogs_with_most_spam-on-multiple-wordpress-blogs-hosting-server-stop-and-clea-large-amounts-ofrcomment-spam
If you're a hosting company that is hosts Joomla / WordPress / ModX websites (each) on separate servers and thus you end up with servers hosting multiple WordPress customer Blogs only, lets say (100+ WP blogs per host) soon your MySQL blogs databases will be full (overfilled) with spam comments. Blogs with multitude of spam comments reduces the WordPress site attractiveness, takes useless disk space, makes wp databases hard to backup and slowing drastically the SQL server.

As our duty as system administrators is to keep the servers optimized (improve performance) and prevent spam-bots to hammer your Linux servers, its is always a good idea to keep an eye on which hosted blogs attract more spammers and cause server overheads and bad hardware optimization.

WordPress blogs keeps logged comments under database_name.wp_comments  (table) thus the quickest way to find out blogs with largest comments tables is to use Linux's find command and print out only comments tables larger than set size.

Here is how:

find /var/lib/mysql/ -type f -size +1024k -name "*_comments.MYD" -exec ls -lh {} ; | awk '{ print $9 ": " $5 }'


/var/lib/mysql/funny-blog/wp_comments.MYD: 15,7M
/var/lib/mysql/wordblogger/wp_comments.MYD: 5,3M
/var/lib/mysql/loveblog/wp_comments.MYD: 50,5M

A comments database of 1MB means about at least 500+ comments, thus the blog loveblog's wp_comments.MYD = 50,5 Mbs contains probably about 10000! comments and should be definitely checked in a browser, if its overfilled with spam because of bad anti-spam policy or missing currently best wordpress spam catcher plugin Akismet. In cases of lack of client to protect his spam you can write quickly a script to auto mail him and ask him kindly to check / fix his blog spam.
In some cases it is useful to write a few liners bash script to automatically disable users with extraordinary blog spam comments databases (quickest way to do it is to move users blog data under quarantine directory and adding a Blog Suspended static html webpage with text like "Please contact support for more info".

1024k find arguments is 1MB, on a big hosted blogs this might be low and you might want to use (100Mb) = 102400kbytes.
You should note that *_comments.MYD in above find cmd is because though standardly wordpress sets wp_ as a prefix to its created skele table structures it is not always the case. 

In above command example find looks for spam comments in /var/lib/mysql (because this is a Debian Linux server), however on other MySQL custom installs, it might be in another dir i.e. /usr/local/mysql/data etc.

It is useful to set the wp_comments statistics output to execute at least once a day as a cronjob:

crontab -u root -e 00 24 * * * /usr/sbin/check_spammed_blogs.sh

vim /usr/check_soammed_blogs.sh

Set a script like:

#!/bin/sh
find /var/lib/mysql/ -type f -size +1024k -name "*_comments.MYD" -exec ls -lh {} ; | awk '{ print $9 ": " $5 }' | tee -a /var/log/blogs_with_most_spam_comments.log

Though above commands is to run on GNU / Linux, for Windows servers based hosting you can  install GNUWin tools and adapt above cmd using windows standard commands or PowerShell to do the same.
Finally you can might want to use some other SQL script to clear blogs with enormously large tables from spam or clear all unapproved spam comments

How to determine which processes make most writes on the hard drive in GNU / Linux using kernel variable

Thursday, November 13th, 2014

how-to-determine-which-processes-writes-most-to-hard-drive-Linux-Kernel
In Linux there are plenty of tools to measure input / ouput – read / write server bottlenecks. Just to mention a few such are, the native part of all Linux distributions IOSTAT – which is a great tool to measure hard disk bottlenecks. However as iostat requires certain sysadmin skills for novice sys-admins, there is also ofcourse more interactive tools such as DSTAT or even better GLANCE which monitors not only disk writes but memory use, CPU load and Network use.

This tools can help you measure which processes are writting most (a lot) to hard disk drive but there is another quick and efficient way to track disk i/o by directly using the Linux kernel this is done via kernel parameter :

/proc/sys/vm/block_dump

To enable block_dump kernel logging:

echo 1 > /proc/sys/vm/block_dump

To later track in real time output from kernel interactively on which running process calling the kernel is writing to server hard drive
 

tail -f /var/log/syslog

The output  looks like so:
 

Nov 13 12:25:51 pcfreak kernel: [1075037.701056] kjournald(297): WRITE block 482293496 on sda1
Nov 13 12:25:51 pcfreak kernel: [1075037.701059] kjournald(297): WRITE block 482293504 on sda1
Nov 13 12:25:51 pcfreak kernel: [1075037.701062] kjournald(297): WRITE block 482293512 on sda1
Nov 13 12:25:51 pcfreak kernel: [1075037.701066] kjournald(297): WRITE block 482293520 on sda1
Nov 13 12:25:51 pcfreak kernel: [1075037.701069] kjournald(297): WRITE block 482293528 on sda1
Nov 13 12:25:51 pcfreak kernel: [1075037.701072] kjournald(297): WRITE block 482293536 on sda1
Nov 13 12:25:51 pcfreak kernel: [1075037.702824] kjournald(297): WRITE block 482293544 on sda1
Nov 13 12:25:52 pcfreak kernel: [1075039.219288] apache2(3377): dirtied inode 3571740 (_index.html.old) on sda1
Nov 13 12:25:52 pcfreak kernel: [1075039.436133] mysqld(22945): dirtied inode 21546676 (#sql_c0a_0.MYI) on sda1
Nov 13 12:25:52 pcfreak kernel: [1075039.436826] mysqld(22945): dirtied inode 21546677 (#sql_c0a_0.MYD) on sda1
Nov 13 12:25:53 pcfreak kernel: [1075039.662832] mysqld(22945): dirtied inode 21546676 (#sql_c0a_0.MYI) on sda1
Nov 13 12:25:53 pcfreak kernel: [1075039.663297] mysqld(22945): dirtied inode 21546677 (#sql_c0a_0.MYD) on sda1
Nov 13 12:25:53 pcfreak kernel: [1075039.817120] apache2(3377): dirtied inode 3571754 (_index.html) on sda1
Nov 13 12:25:53 pcfreak kernel: [1075039.819968] apache2(3377): dirtied inode 3571740 (_index.html_gzip) on sda1
Nov 13 12:25:53 pcfreak kernel: [1075039.820016] apache2(3377): dirtied inode 3571730 (?) on sda1
Nov 13 12:25:53 pcfreak kernel: [1075040.491378] mysqld(22931): dirtied inode 21546676 (#sql_c0a_0.MYI) on sda1
Nov 13 12:25:53 pcfreak kernel: [1075040.492309] mysqld(22931): dirtied inode 21546677 (#sql_c0a_0.MYD) on sda1
Nov 13 12:25:54 pcfreak kernel: [1075041.551513] apache2(3377): dirtied inode 1474706 (_index.html_gzip.old) on sda1
Nov 13 12:25:54 pcfreak kernel: [1075041.551566] apache2(3377): dirtied inode 1474712 (_index.html.old) on sda1
Nov 13 12:25:55 pcfreak kernel: [1075041.769036] mysqld(22941): dirtied inode 21546676 (#sql_c0a_0.MYI) on sda1
Nov 13 12:25:55 pcfreak kernel: [1075041.769804] mysqld(22941): dirtied inode 21546677 (#sql_c0a_0.MYD) on sda1
Nov 13 12:25:55 pcfreak kernel: [1075041.985857] apache2(3282): dirtied inode 4063282 (data_9d97a7f62d54bc5fd791fba3245ba591-SMF-modSettings.php) on sda1
Nov 13 12:25:55 pcfreak kernel: [1075041.987460] apache2(3282): dirtied inode 29010186 (data_9d97a7f62d54bc5fd791fba3245ba591-SMF-permissions–1.php) on sda1
Nov 13 12:25:55 pcfreak kernel: [1075041.988357] flush-8:0(289): WRITE block 51350632 on sda1

Using the kernel method to see which processes are stoning your server is great way especially for servers without connectivity to the Internet where you have no possibility to install sysstat package (contaning iostat),  dstat or glance.
Thanks to Marto's blog for  this nice hack.