Posts Tagged ‘Disk’

Monitoring Linux hardware Hard Drives / Temperature and Disk with lm_sensors / smartd / hddtemp and Zabbix Userparameter lm_sensors report script

Thursday, April 30th, 2020

monitoring-linux-hardware-with-software-temperature-disk-cpu-health-zabbix-userparameter-script

I'm part of a  SysAdmin Team that is partially doing some minor Zabbix imrovements on a custom corporate installed Zabbix in an ongoing project to substitute the previous HP OpenView monitoring for a bunch of Legacy Linux hosts.
As one of the necessery checks to have is regarding system Hardware, the task was to invent some simplistic way to monitor hardware with the Zabbix Monitoring tool.  Monitoring Bare Metal servers hardware of HP / Dell / Fujituse etc. servers  in Linux usually is done with a third party software provided by the Hardware vendor. But as this requires an additional services to run and sometimes is not desired. It was interesting to find out some alternative Linux native ways to do the System hardware monitoring.
Monitoring statistics from the system hardware components can be obtained directly from the server components with ipmi / ipmitool (for more info on it check my previous article Reset and Manage intelligent  Platform Management remote board article).
With ipmi
 hardware health info could be received straight from the ILO / IDRAC / HPMI of the server. However as often the Admin-Lan of the server is in a seperate DMZ secured network and available via only a certain set of routed IPs, ipmitool can't be used.

So what are the other options to use to implement Linux Server Hardware Monitoring?

The tools to use are perhaps many but I know of two which gives you most of the information you ever need to have a prelimitary hardware damage warning system before the crash, these are:
 

1. smartmontools (smartd)

Smartd is part of smartmontools package which contains two utility programs (smartctl and smartd) to control and monitor storage systems using the Self-Monitoring, Analysis and Reporting Technology system (SMART) built into most modern ATA/SATA, SCSI/SAS and NVMe disks

Disk monitoring is handled by a special service the package provides called smartd that does query the Hard Drives periodically aiming to find a warning signs of hardware failures.
The downside of smartd use is that it implies a little bit of extra load on Hard Drive read / writes and if misconfigured could reduce the the Hard disk life time.

 

linux:~#  /usr/sbin/smartctl -a /dev/sdb2
smartctl 6.6 2017-11-05 r4594 [x86_64-linux-4.19.0-5-amd64] (local build)
Copyright (C) 2002-17, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model:     KINGSTON SA400S37240G
Serial Number:    50026B768340AA31
LU WWN Device Id: 5 0026b7 68340aa31
Firmware Version: S1Z40102
User Capacity:    240,057,409,536 bytes [240 GB]
Sector Size:      512 bytes logical/physical
Rotation Rate:    Solid State Device
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-3 T13/2161-D revision 4
SATA Version is:  SATA 3.2, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is:    Thu Apr 30 14:05:01 2020 EEST
SMART support is: Available – device has SMART capability.
SMART support is: Enabled

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status:  (0x00) Offline data collection activity
                                        was never started.
                                        Auto Offline Data Collection: Disabled.
Self-test execution status:      (   0) The previous self-test routine completed
                                        without error or no self-test has ever
                                        been run.
Total time to complete Offline
data collection:                (  120) seconds.
Offline data collection
capabilities:                    (0x11) SMART execute Offline immediate.
                                        No Auto Offline data collection support.
                                        Suspend Offline collection upon new
                                        command.
                                        No Offline surface scan supported.
                                        Self-test supported.
                                        No Conveyance Self-test supported.
                                        No Selective Self-test supported.
SMART capabilities:            (0x0002) Does not save SMART data before
                                        entering power-saving mode.
                                        Supports SMART auto save timer.
Error logging capability:        (0x01) Error logging supported.
                                        General Purpose Logging supported.
Short self-test routine
recommended polling time:        (   2) minutes.
Extended self-test routine
recommended polling time:        (  10) minutes.

SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x0032   100   100   000    Old_age   Always       –       100
  9 Power_On_Hours          0x0032   100   100   000    Old_age   Always       –       2820
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       –       21
148 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      –       0
149 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      –       0
167 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      –       0
168 Unknown_Attribute       0x0012   100   100   000    Old_age   Always       –       0
169 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      –       0
170 Unknown_Attribute       0x0000   100   100   010    Old_age   Offline      –       0
172 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       –       0
173 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      –       0
181 Program_Fail_Cnt_Total  0x0032   100   100   000    Old_age   Always       –       0
182 Erase_Fail_Count_Total  0x0000   100   100   000    Old_age   Offline      –       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       –       0
192 Power-Off_Retract_Count 0x0012   100   100   000    Old_age   Always       –       16
194 Temperature_Celsius     0x0022   034   052   000    Old_age   Always       –       34 (Min/Max 19/52)
196 Reallocated_Event_Count 0x0032   100   100   000    Old_age   Always       –       0
199 UDMA_CRC_Error_Count    0x0032   100   100   000    Old_age   Always       –       0
218 Unknown_Attribute       0x0032   100   100   000    Old_age   Always       –       0
231 Temperature_Celsius     0x0000   097   097   000    Old_age   Offline      –       97
233 Media_Wearout_Indicator 0x0032   100   100   000    Old_age   Always       –       2104
241 Total_LBAs_Written      0x0032   100   100   000    Old_age   Always       –       1857
242 Total_LBAs_Read         0x0032   100   100   000    Old_age   Always       –       1141
244 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      –       32
245 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      –       107
246 Unknown_Attribute       0x0000   100   100   000    Old_age   Offline      –       15940

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
No self-tests have been logged.  [To run self-tests, use: smartctl -t]

Selective Self-tests/Logging not supported

 

2. hddtemp

 

Usually if smartd is used it is useful to also use hddtemp which relies on smartd data.
 The hddtemp program monitors and reports the temperature of PATA, SATA
 or SCSI hard drives by reading Self-Monitoring Analysis and Reporting
 Technology (S.M.A.R.T.)
information on drives that support this feature.
 

linux:~# /usr/sbin/hddtemp /dev/sda1
/dev/sda1: Hitachi HDS721050CLA360: 31°C
linux:~# /usr/sbin/hddtemp /dev/sdc6
/dev/sdc6: KINGSTON SV300S37A120G: 25°C
linux:~# /usr/sbin/hddtemp /dev/sdb2
/dev/sdb2: KINGSTON SA400S37240G: 34°C
linux:~# /usr/sbin/hddtemp /dev/sdd1
/dev/sdd1: WD Elements 10B8: S.M.A.R.T. not available

 

 

3. lm-sensors / i2c-tools 

 Lm-sensors is a hardware health monitoring package for Linux. It allows you
 to access information from temperature, voltage, and fan speed sensors.
i2c-tools
was historically bundled in the same package as lm_sensors but has been seperated cause not all hardware monitoring chips are I2C devices, and not all I2C devices are hardware monitoring chips.

The most basic use of lm-sensors is with the sensors command

 

linux:~# sensors
i350bb-pci-0600
Adapter: PCI adapter
loc1:         +55.0 C  (high = +120.0 C, crit = +110.0 C)

 

coretemp-isa-0000
Adapter: ISA adapter
Physical id 0:  +28.0 C  (high = +78.0 C, crit = +88.0 C)
Core 0:         +26.0 C  (high = +78.0 C, crit = +88.0 C)
Core 1:         +28.0 C  (high = +78.0 C, crit = +88.0 C)
Core 2:         +28.0 C  (high = +78.0 C, crit = +88.0 C)
Core 3:         +28.0 C  (high = +78.0 C, crit = +88.0 C)

 


On CentOS Linux useful tool is also  lm_sensors-sensord.x86_64 – A Daemon that periodically logs sensor readings to syslog or a round-robin database, and warns of sensor alarms.

In Debian Linux there is also the psensors-server (an HTTP server providing JSON Web service which can be used by GTK+ Application to remotely monitor sensors) useful for developers
psesors-server

psensor-linux-graphical-tool-to-check-cpu-hard-disk-temperature-unix

If you have a Xserver installed on the Server accessed with Xclient or via VNC though quite rare,
You can use xsensors or Psensora GTK+ (Widget Toolkit for creating Graphical User Interface) application software.

With this 3 tools it is pretty easy to script one liners and use the Zabbix UserParameters functionality to send hardware report data to a Company's Zabbix Sserver, though Zabbix has already some templates to do so in my case, I couldn't import this templates cause I don't have Zabbix Super-Admin credentials, thus to work around that a sample work around is use script to monitor for higher and critical considered temperature.
Here is a tiny sample script I came up in 1 min time it can be used to used as 1 liner UserParameter and built upon something more complex.

SENSORS_HIGH=`sensors | awk '{ print $6 }'| grep '^+' | uniq`;
SENSORS_CRIT=`sensors | awk '{ print $9 }'| grep '^+' | uniq`; ;SENSORS_STAT=`sensors|grep -E 'Core\s' | awk '{ print $1" "$2" "$3 }' | grep "$SENSORS_HIGH|$SENSORS_CRIT"`;
if [ ! -z $SENSORS_STAT ]; then
echo 'Temperature HIGH';
else 
echo 'Sensors OK';
fi 

Of course there is much more sophisticated stuff to use for monitoring out there


Below script can be easily adapted and use on other Monitoring Platforms such as Nagios / Munin / Cacti / Icinga and there are plenty of paid solutions, but for anyone that wants to develop something from scratch just like me I hope this
article will be a good short introduction.
If you know some other Linux hardware monitoring tools, please share.

Monitor General Server / Desktop system health in console on Linux and FreeBSD

Tuesday, October 4th, 2011

slurm-output-monitoring-networking
saidar
is a text based ncurses program to display live statistics about general system health.

It displays in one refreshable screen (similar to top) statistics about server state of:
CPU, Load, Memory, Swap, Network, I/O disk operations
Besides that saidar supports a ncurses console colors, which makes it more funny to look at.
Saidar extracts the statistics for system state based on libgstrap cross platform statistics library about pc system health.

On Debian, Ubuntu, Fedora, CentOS Linuxes saider is available for install straight from distribution repositories.
On Debian and Ubuntu saidar is installed with cmd:

debian:~# apt-get install saidar
...

On CentOS and Fedora saidar is bundled as a part of statgrab-tools rpm package.
Installing it on 64 bit CentOS with yum is with command:

[root@centos ~]# yum install statgrab-tools.x86_64

Saidar is also available on FreeBSD as a part of the /usr/ports/devel/libgstrab, hence to use on my FreeBSD I had to install the libgstrab port:

freebsd# cd /usr/ports/devel/libstatgrab
freebsd# make install clean

Here is saidar running on my Desktop Debian on Thinkpad in color output:

debian:~# saidar -c

Saidar Linux General statistics Screenshot

I've seen many people, who use various shell scripts to output system monitoring information, this scripts however are often written to just run without efficiency in mind and they put some let's say 1% extra load on the system CPU. This is not the case with saidar which is written in C and hence the program is optimized well for what it does.

Update: Next to saidar I recommend you check out Slurm (Real Time Network Interface Monitor) it can visualizes network interface traffic using ascii graph such as on top of the article. On Debian and Ubuntu Slurm is available and easily installable via simple:
 

apt-get install –yes slurm

 

How rescue unbootable Windows PC, Windows files through files Network copy to remote server shared Folder using Hirens Boot CD

Saturday, November 12th, 2011

hirens-boot-cd-logo-how-to-rescue-unbootable-pc-with-hirens-bootcd
I'm rescuing some files from one unbootable Windows XP using a livecd with Hirens Boot CD 13

In order to rescue the three NTFS Windows partitions files, I mounted them after booting a Mini Linux from Hirens Boot CD.

Mounting NTFS using Hirens BootCD went quite smoothly to mount the 3 partitions I used cmds:

# mount /dev/sda1 /mnt/sda1
# mount /dev/sda2 /mnt/sda2
# mount /dev/sdb1 /mnt/sdb1

After the three NTFS file partitions are mounted I used smbclient to list all the available Network Shares on the remote Network Samba Shares Server which by the way possessed the NETBIOS name of SERVER 😉

# smbclient -L //SERVER/
Enter root's password:
Domain=[SERVER] OS=[Windows 7 Ultimate 7600] Server=[Windows 7 Ultimate 6.1]

Sharename Type Comment
——— —- ——-
!!!MUSIC Disk
ADMIN$ Disk Remote Admin
C$ Disk Default share
Canon Inkjet S9000 (Copy 2) Printer Canon Inkjet S9000 (Copy 2)
D$ Disk Default share
Domain=[SERVER] OS=[Windows 7 Ultimate 7600] Server=[Windows 7 Ultimate 6.1]
Server Comment
——— ——-
Workgroup Master
——— ——-

Further on to mount the //SERVER/D network samba drive – (the location where I wanted to transfer the files from the above 3 mounted partitions):

# mkdir /mnt/D
# mount //192.168.0.100/D /mnt/D
#

Where the IP 192.168.0.100 is actually the local network IP address of the //SERVER win smb machine.

Afterwards I used mc to copy all the files I needed to rescue from all the 3 above mentioned win partitions to the mounted //SERVER/D
 

Frogatto & Friends – One of the TOP 10 Arcade Free Software & Open Source Games for GNU / Linux and FreeBSD

Friday, December 16th, 2011

Frogatto old-school 2d jump and run free software game for GNU / Linux and FreeBSD
1. Frogatto & Friends – Is an Indian Free Software (Open Source) game in the spirit of old-school jump’en runs like Commander Keen, Prehistoric, Jazz Jack Rabbit

The game is really entertaining, the graphics looks approximately nice, the music is awesome, the gamelplay is good even though after some point in the game the moment with “where should I go now, I can’t find exit” comes through and it gets boring.

Generally if you compare with all the existing jump and run arcade games free software games available for Linux and FreeBSD the game will definetely arrange itself in the list of TOP 10 free software Arcade Games
and therefore its my own believe that Frogatto is a game that every GNU / Linux and FreeBSD desktop should have in Application -> Games GNOME menu.

Frogatto is rich of levels, enemies obstacles objects, places to visit (which puts it ahead of many of the linux arcade games which often miss enough game levels, has a too short game plots, or simply miss overall game diversity).

Frogatto linux freebsd game bombing airplaine

The game’s general look & feel is like a professional game and not just some tiny free software arcade, made by its authors for the sake to learn some programming, graphics or music creation.
Frogatto door leading to Grotto

Frogatto Free Software game wood screenshot

Besides that Frogatto & Friends is multi-platform supporting all the major operating systems.
Game supports:
 

  • Windows
  • Mac
  • iPhone
  • Debian GNU / Linux
  • FreeBSD

The game source code is also available on Frogatto.com – The Game’s Official website

The game is available as a deb package in Debian and Ubuntu GNU / Linuxes so to install on those deb based distributions, simply use apt:

debian:~# apt-get install frogatto
...

The above command will install two packages frogatto (containing the game’s main executable binary) and frogatto-data containinng all the game textures, levels, graphics, music etc.

BTW the package saparation on a gamename and gamename-data in Debian (for all those who have not still noticed), can be seen on most of the games with a game data that takes more disk space.

After the game is installed the only way to start the game is to run it manually through pressing ALT+F2 in GNOME or running the progrtam through gnome-terminal with cmd:

debian:~$ frogatto

Here are few more Frogatto gameplay screenshots:

Frogatto free open source game screenshot a game bad guy

Frogatto different level screenshot

I’ve noticed Frogatto is also available as an RPM package for Fedora Linux, as well as has a FreeBSD port in the /usr/ports/games/frogatto and this makes it easy to install on most free software OSes in the wild.

While checking frogatto.com , I found an interesting link to a website offering free graphics (pictures), textures and sounds for free and open source games for all those who hold interest into the development of Free Software & Open Source Games make sure you check OpenGameArt.org

OpenGameArt.org looks like a great initiative and will definitely be highly beneficial to the development of more and better FSOS Games so I wish them God speed with this noble initiative.

Frogatto is very suitable for growing kids since it doesn’t contain no violence and every now and then the main game actor the Frogatto Frog leads few lines English dialogues with some of the characters found in the quest.
For none speaking English countries, the game can help the kids to learn some basic english words and thus can help develop kids intellect and knowledge
And oh yeah one more criticism towards the game is the Enlish structure, it seems people who wrote the plot can work this out in the time to come. Many of the English sentences during dialogues the frog leads with the cranks he met does not sound like a common and sometimes even correct english / phrases.

Besides those little game “defect”, the game is pretty awesome and worthy to kill some time and relax from a long stressy day.

Tracking I/O hard disk server bottlenecks with iostat on GNU / Linux and FreeBSD

Tuesday, March 27th, 2012

Hard disk overhead tracking on Linux and FreeBSD with iostat

I've earlier wrote an article How to find which processes are causing hard disk i/o overhead on Linux there I explained very rawly few tools which can be used to benchmark hard disk read / write operations. My prior article accent was on iotop and dstat and it just mentioned of iostat. Therefore I've wrote this short article in attempt to explain a bit more thoroughfully on how iostat can be used to track problems with excessive server I/O read/writes.

Here is the command man page description;
iostatReport Central Processing Unit (CPU) statistics and input/output statistics for devices, partitions and network filesystems

I will further proceed with few words on how iostat can be installed on various Linux distros, then point at few most common scenarious of use and a short explanation on the meaning of each of the command outputs.

1. Installing iostat on Linux

iostat is a swiss army knife of finding a server hard disk bottlenecks. Though it is a must have tool in the admin outfut, most of Linux distributions will not have iostat installed by default.
To have it on your server, you will need to install sysstat package:

a) On Debian / Ubuntu and other Debian GNU / Linux derivatives to install sysstat:

debian:~# apt-get --yes install sysstat

b) On Fedora, CentOS, RHEL etc. install is with yum:

[root@centos ~]# yum -y install sysstat

c) On Slackware Linux sysstat package which contains iostat is installed by default. 

d) In FreeBSD, there is no need for installation of any external package as iostat is part of the BSD world (bundle commands).
I should mention bsd iostat and Linux's iostat commands are not the same and hence there use to track down hard disk bottlenecks differs a bit, however the general logic of use is very similar as with most tools in BSD and Linux.

2. Checking a server hard disk for i/o disk bottlenecks on G* / Linux

Once having the sysstat installed on G* / Linux systems, the iostat command will be added in /usr/bin/iostat
a) To check what is the hard disk read writes per second (in megabytes) use:

debian:~# /usr/bin/iostat -m
Linux 2.6.32-5-amd64 (debian) 03/27/2012 _x86_64_ (8 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
15.34 0.36 2.76 2.66 0.00 78.88
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 63.89 0.48 8.20 6730223 115541235
sdb 64.12 0.44 8.23 6244683 116039483
md0 2118.70 0.22 8.19 3041643 115528074

In the above output the server, where I issue the command is using sda and sdb configured in software RAID 1 array visible in the output as (md0)

The output of iostat should already be easily to read, for anyone who didn't used the tool here is a few lines explanation of the columns:

The %user 15.34 meaning is that 15.34 out of 100% possible i/o load is generad by system level read/write operations.
%nice – >Show the percentage of CPU utilization that occurred while executing at the user level with nice priority.
%iowait – just like the top command idle it shows the idle time when the system didn't have an outstanding disk I/O requests.
%steal – show percentage in time spent in time wait of CPU or virtual CPUs to service another virtual processor (high numbers of disk is sure sign for i/o problem).
%idle – almost the same as meaning to %iowait
tps – HDD transactions per second
MB_read/s (column) – shows the actual Disk reads in Mbytes at the time of issuing iostat
MB_wrtn/s – displays the writes p/s at the time of iostat invocation
MB_read – shows the hard disk read operations in megabytes, since the server boot 'till moment of invocation of iostat
MB_wrtn – gives the number of Megabytes written on HDD since the last server boot filesystem mount

The reason why the Read / Write values for sda and sdb are similar in this example output is because my disks are configured in software RAID1 (mirror)

The above iostat output reveals in my specific case the server is experiencing mostly Disk writes (observable in the high MB_wrtn/s 8.19 md0 in the above sample output).

It also reveals, the I/O reads experienced on that server hard disk are mostly generated as a system (user level load) – see (%user 15.34 and md0 2118.70).

For all those not familiar with system also called user / level load, this is all kind of load which is generated by running programs on the server – (any kind of load not generated by the Linux kernel or loaded kernel modules).

b) To periodically keep an eye on HDD i/o operations with iostat, there are two ways:

– Use watch in conjunction with iostat;

[root@centos ~]# watch "/usr/bin/iostat -m"
Every 2.0s: iostat -m Tue Mar 27 11:00:30 2012
Linux 2.6.32-5-amd64 (centos) 03/27/2012 _x86_64_ (8 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
15.34 0.36 2.76 2.66 0.00 78.88
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 63.89 0.48 8.20 6730255 115574152
sdb 64.12 0.44 8.23 6244718 116072400
md0 2118.94 0.22 8.20 3041710 115560990
Device: tps MB_read/s MB_wrtn/s MB_read MB_wrtn
sda 55.00 0.01 25.75 0 51
sdb 52.50 0.00 24.75 0 49
md0 34661.00 0.01 135.38 0 270

Even though watch use and -d might appear like identical, they're not watch does refresh the screen, executing instruction similar to the clear command which clears screen on every 2 seconds, so the output looks like the top command refresh, while passing the -d 2 will output the iostat command output on every 2 secs in a row so all the data is visualized on the screen. Hence -d 2 in cases, where more thorough debug is necessery is better. However for a quick routine view watch + iostat is great too.

c) Outputting extra information for HDD input/output operations;

root@debian:~# iostat -x
Linux 2.6.32-5-amd64 (debian) 03/27/2012 _x86_64_ (8 CPU)
avg-cpu: %user %nice %system %iowait %steal %idle
15.34 0.36 2.76 2.66 0.00 78.88
Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
sda 4.22 2047.33 12.01 51.88 977.44 16785.96 278.03 0.28 4.35 3.87 24.72
sdb 3.80 2047.61 11.97 52.15 906.93 16858.32 277.05 0.03 5.25 3.87 24.84
md0 0.00 0.00 20.72 2098.28 441.75 16784.05 8.13 0.00 0.00 0.00 0.00

This command will output extended useful Hard Disk info like;
r/s – number of read requests issued per second
w/s – number of write requests issued per second
rsec/s – numbers of sector reads per second
b>wsec/s – number of sectors wrote per second
etc. etc.

Most of ppl will never need to use this, but it is good to know it exists.

3. Tracking read / write (i/o) hard disk bottlenecks on FreeBSD

BSD's iostat is a bit different in terms of output and arguments.

a) Here is most basic use:

freebsd# /usr/sbin/iostat
tty ad0 cpu
tin tout KB/t tps MB/s us ni sy in id
1 561 45.18 44 1.95 14 0 5 0 82

b) Periodic watch of hdd i/o operations;

freebsd# iostat -c 10
tty ad0 cpu
tin tout KB/t tps MB/s us ni sy in id
1 562 45.19 44 1.95 14 0 5 0 82
0 307 51.96 113 5.73 44 0 24 0 32
0 234 58.12 98 5.56 16 0 7 0 77
0 43 0.00 0 0.00 1 0 0 0 99
0 485 0.00 0 0.00 2 0 0 0 98
0 43 0.00 0 0.00 0 0 1 0 99
0 43 0.00 0 0.00 0 0 0 0 100
...

As you see in the output, there is information like in the columns tty, tin, tout which is a bit hard to comprehend.
Thanksfully the tool has an option to print out only more essential i/o information:

freebsd# iostat -d -c 10
ad0
KB/t tps MB/s
45.19 44 1.95
58.12 97 5.52
54.81 108 5.78
0.00 0 0.00
0.00 0 0.00
0.00 0 0.00
20.48 25 0.50

The output info is quite self-explanatory.

Displaying a number of iostat values for hard disk reads can be also achieved by omitting -c option with:

freebsd# iostat -d 1 10
...

Tracking a specific hard disk partiotion with iostat is done with:

freebsd# iostat -n /dev/ad0s1a
tty cpu
tin tout us ni sy in id
1 577 14 0 5 0 81
c) Getting Hard disk read/write information with gstat

gstat is a FreeBSD tool to print statistics for GEOM disks. Its default behaviour is to refresh the screen in a similar fashion like top command, so its great for people who would like to periodically check all attached system hard disk and storage devices:

freebsd# gstat
dT: 1.002s w: 1.000s
L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
0 10 0 0 0.0 10 260 2.6 15.6| ad0
0 10 0 0 0.0 10 260 2.6 11.4| ad0s1
0 10 0 0 0.0 10 260 2.8 12.5| ad0s1a
0 0 0 0 0.0 0 0 0.0 20.0| ad0s1b
0 0 0 0 0.0 0 0 0.0 0.0| ad0s1c
0 0 0 0 0.0 0 0 0.0 0.0| ad0s1d
0 0 0 0 0.0 0 0 0.0 0.0| ad0s1e
0 0 0 0 0.0 0 0 0.0 0.0| acd0

It even has colors if your tty supports colors 🙂

Another useful tool in debugging the culprit of excessive hdd I/O operations is procstat command:

Here is a sample procstat run to track (httpd) one of my processes imposing i/o hdd load:

freebsd# procstat -f 50404
PID COMM FD T V FLAGS REF OFFSET PRO NAME
50404 httpd cwd v d -------- - - - /
50404 httpd root v d -------- - - - /
50404 httpd 0 v c r------- 56 0 - -
50404 httpd 1 v c -w------ 56 0 - -
50404 httpd 2 v r -wa----- 56 75581 - /var/log/httpd-error.log
50404 httpd 3 s - rw------ 105 0 TCP ::.80 ::.0
50404 httpd 4 p - rw---n-- 56 0 - -
50404 httpd 5 p - rw------ 56 0 - -
50404 httpd 6 v r -wa----- 56 25161132 - /var/log/httpd-access.log
50404 httpd 7 v r rw------ 56 0 - /tmp/apr8QUOUW
50404 httpd 8 v r -w------ 56 0 - /var/run/accept.lock.49588
50404 httpd 9 v r -w------ 1 0 - /var/run/accept.lock.49588
50404 httpd 10 v r -w------ 1 0 - /tmp/apr8QUOUW
50404 httpd 11 ? - -------- 2 0 - -

Btw fstat is sometimes helpful in identifying the number of open files and trying to estimate which ones are putting the hdd load.
Hope this info helps someone. If you know better ways to track hdd excessive loads on Linux / BSD pls share 'em pls.
 

How to manually disable Windows Genuine Advantage on Windows XP SP2

Wednesday, May 25th, 2011

WGA Notification message popup message

I have a pirate version of Windows XP Pro 2 installer CD which does automatically turn on Windows Genuine Advantage

This is kind of annoying as the computer gets really slow and the hard disk drive activite gets intensive as well as an annoying popup message that the Windows XP copy is not genuine does appear periodically

In order to get rid of the message I had to do the following steps:

1. Get into Windows Safe Mode without Networking

As most of the people knows this is achieved by pressing F8 keyboard key right before the Windows bootup screen appears.

After in Safe mode it’s necessery to,

2. Run Windows Command Line (cmd.exe)

To do so follow, the menus:

Windows (Start Menu) -> Run -> cmd.exe

3. In the command prompt window issue the commands:

C:Documents and SettingsUser> cd WindowsSystem32
C:WindowsSystem32> taskkill -IM wgatray.exe
C:WindowsSystem32> del wgatray.exe
C:WindowsSystem32> move wgalogon.dll wgalogon.dll.old
C:WindowsSystem32> del wgalogon.dll.old

Something to mention is you have to be really quick, with deleting wgalogon.dll, cause wgatray.exe is scheduled to run every 1 / 2 seconds 🙂 It is a bit of situation of type “be quick or be dead” as Maiden used to sing 🙂
A Windows system restart and Hooray the Windows Genuine message is gone 🙂

How to tune MySQL Server to increase MySQL performance using mysqltuner.pl and Tuning-primer.sh

Tuesday, August 9th, 2011

MySQL Easy performance tuning with mysqltuner.pl and Tuning-primer.sh scripts

Improving MySQL performance is crucial for improving a website responce times, reduce server load and improve overall work efficiency of a mysql database server.

I’ve seen however many Linux System administrators who does belittle or completely miss the significance of tuning a newly installed MySQL server installation.
The reason behind that is probably caused by fact that many people think MySQL config variables, would not significantly improve performance and does not pay back for optimization efforts. Moreover there are a bunch of system admins who has to take care for numerous services so they don’t have time to get good knowledge to optimize MySQL servers.
Thus many admins and webmasters nowdays, think optimizations depend mostly on the side of the website programmers.
It’s also sometimes falsely believed that optimizing a MySQL server could reduce the overall server stability.

With the boom of Internet website building and internet marketing, many webmasters emerged and almost anybody with almost no knowledge on GNU/Linux or minimal or no knowledge on PHP can start his Online store, open a blog or create a website powered by some CMS like joomla.
Thus nowdays many servers even doesn’t have a hired system administrators but are managed by people whose knowledge on *Nix is almost next to zero, this is another reason why dozens of MySQL installations online are a default ones and are not taking a good advantage of the server hardware.

The incrase of website visitors leads people servers expectations for hardware also to grow, thus many companies simply buy a new hardware instead of taking the few time to investigate on how current server hardware can be utilized better.
In that manner of thought I though it will be a good idea to write this small article on Tuning mysql servers with two scripts Tuning-primer.sh and mysqltuner.pl.
The scripts are ultra easy to use and does not require only a minimal knowledge on MySQL, Linux or (*BSD *nix if sql is running on BSD).
Tuning-primer.sh and mysqltuner.pl are therefore suitable for a quick MySQL server optimizations to even people who are no computer experts.

I use this two scripts for MySQL server optimizations on almost every new configured GNU/Linux with a MySQL backend.
Use of the script comes to simply download with wget, lynx, curl or some other web client and execute it on the server host which is already running the MySQL server.

Here is an example of how simple it is to run the scripts to Optimize MySQL:

debian:~# perl mysqltuner.pl
>> MySQLTuner 1.2.0 - Major Hayden >major@mhtx.net<
>> Bug reports, feature requests, and downloads at http://mysqltuner.com/
>> Run with '--help' for additional options and output filtering

——– General Statistics ————————————————–
[–] Skipped version check for MySQLTuner script
[OK] Currently running supported MySQL version 5.1.49-3
[OK] Operating on 64-bit architecture

——– Storage Engine Statistics ——————————————-
[–] Status: +Archive -BDB -Federated +InnoDB -ISAM -NDBCluster
[–] Data in MyISAM tables: 6G (Tables: 952)
[!!] InnoDB is enabled but isn’t being used
[!!] Total fragmented tables: 12

——– Security Recommendations ——————————————-
[OK] All database users have passwords assigned

——– Performance Metrics ————————————————-
[–] Up for: 1d 2h 3m 35s (68M q [732.193 qps], 610K conn, TX: 49B, RX: 11B)
[–] Reads / Writes: 76% / 24%
[–] Total buffers: 512.0M global + 2.8M per thread (2000 max threads)
[OK] Maximum possible memory usage: 6.0G (25% of installed RAM)
[OK] Slow queries: 0% (3K/68M)
[OK] Highest usage of available connections: 7% (159/2000)
[OK] Key buffer size / total MyISAM indexes: 230.0M/1.7G
[OK] Key buffer hit rate: 97.8% (11B cached / 257M reads)
[OK] Query cache efficiency: 76.6% (46M cached / 61M selects)
[!!] Query cache prunes per day: 1822075
[OK] Sorts requiring temporary tables: 0% (1K temp sorts / 2M sorts)
[!!] Joins performed without indexes: 63635
[OK] Temporary tables created on disk: 1% (26K on disk / 2M total)
[OK] Thread cache hit rate: 99% (159 created / 610K connections)
[!!] Table cache hit rate: 4% (1K open / 43K opened)
[OK] Open file limit used: 17% (2K/16K)
[OK] Table locks acquired immediately: 99% (36M immediate / 36M locks)

——– Recommendations —————————————————–
General recommendations:
Add skip-innodb to MySQL configuration to disable InnoDB
Run OPTIMIZE TABLE to defragment tables for better performance
Enable the slow query log to troubleshoot bad queries
Increasing the query_cache size over 128M may reduce performance
Adjust your join queries to always utilize indexes
Increase table_cache gradually to avoid file descriptor limits
Variables to adjust:
query_cache_size (> 256M) [see warning above] join_buffer_size (> 256.0K, or always use indexes with joins) table_cache (> 7200)

You see there are plenty of things, the script reports, for the unexperienced most of the information can be happily skipped without need to know the cryptic output, the section of importance here is Recommendations for some clarity, I’ve made this section to show up bold.

The most imporant things from the Recommendations script output is actually the lines who give suggestions for incrase of certain variables for MySQL.In this example case this are the last three variables:
query_cache_size,
join_buffer_size and
table_cache

All of these variables are tuned from /etc/mysql/my.cnf (on Debian) and derivatives distros and from /etc/my.cnf on RHEL, CentOS, Fedora and the other RPM based Linux distributions.

On some custom server installs my.cnf is also located in /usr/local/mysql/etc/ or some other a bit more unstandard location 😉

Anyways now having the Recommendation from the script, it’s necessery to edit my.cnf and try to double the values for the suggested variables.

First, I check if all the suggested variables are existent in my config with grep , if they’re not then I’ll simply add the variable with doubled size of the suggested values.
P.S: One note here is sometimes some values which are configured, are the default value for the MySQL server and does not have a record in my.cnf

debian:~# grep -E 'query_cache_size|join_buffer_size|table_cache' /etc/mysql/my.cnf table_cache = 7200
query_cache_size = 256M
join_buffer_size = 262144

All of my variables are in the config so, now edit my.cnf and set values to:
table_cache = 14400
query_cache_size = 512M
join_buffer_size = 524288

I always, however preserve the old variable’s value, because sometimes raising the value might create problem and the MySql server might be unable to restart properly.
Thus before going with adding the new values make sure the old ones are commented with # , e.g.:
#table_cache = 7200
#query_cache_size = 256M
#join_buffer_size = 262144

I would recommend vim as editor of choice while editing my.cnf as vim completely rox 😉 If you’re not acquainted to vim use nano or mcedit or your editor of choice 😉 :

debian:~# vim /etc/mysql/my.cnf
...

Assuming that the changes are made, it’s time to restart MySQL to make sure the new values are read by the SQL server.

debian:~# /etc/init.d/mysql restart
* Stopping MySQL database server mysqld [ OK ]
* Starting MySQL database server mysqld [ OK ]
Checking for tables which need an upgrade, are corrupt or were not closed cleanly.

If mysql server fails, however to restart, make sure immediately you reverse back the changed variables to the commented values and restart once again via mysql init script to make server load.

Afterwards start adding the values one by one until find out which one is causing the mysqld to fail.

Now the second script (Tuning-primer.sh) is also really nice for MySQL performance optimizations are necessery. However it’s less portable (as it’s written in bash scripting language).
Consider running this script among different GNU/Linux distributious (especially the newer ones) might produce errors.
Tuning-primer.sh requires some minor code changes to be able to run on FreeBSD, NetBSD and OpenBSD *nices.

The way Tuning-primer.sh works is precisely like mysqltuner.pl , one runs it it gives some info about current running MySQL server and based on certain factors gives suggestions on how increasing or decresing certain my.cnf variables could reduce sql query bottlenecks, solve table locking issues as well as generally improve INSERT, UPDATE query times.

Here is an example output from tuning-primer.sh run on another server:

server:~# wget https://www.pc-freak.net/files/Tuning-primer.sh
...
server:~# sh Tuning-primer.sh
-- MYSQL PERFORMANCE TUNING PRIMER --
- By: Matthew Montgomery -

MySQL Version 5.0.51a-24+lenny5 x86_64

Uptime = 8 days 10 hrs 19 min 8 sec
Avg. qps = 179
Total Questions = 130851322
Threads Connected = 1

Server has been running for over 48hrs.
It should be safe to follow these recommendations

To find out more information on how each of these
runtime variables effects performance visit:
http://dev.mysql.com/doc/refman/5.0/en/server-system-variables.html

SLOW QUERIES
Current long_query_time = 1 sec.
You have 16498 out of 130851322 that take longer than 1 sec. to complete
The slow query log is NOT enabled.
Your long_query_time seems to be fine

MAX CONNECTIONS
Current max_connections = 2000
Current threads_connected = 1
Historic max_used_connections = 85
The number of used connections is 4% of the configured maximum.
Your max_connections variable seems to be fine.

WORKER THREADS
Current thread_cache_size = 128
Current threads_cached = 84
Current threads_per_sec = 0
Historic threads_per_sec = 0
Your thread_cache_size is fine

MEMORY USAGE
Tuning-primer.sh: line 994: let: expression expected
Max Memory Ever Allocated : 741 M
Configured Max Memory Limit : 5049 M
Total System Memory : 23640 M

KEY BUFFER
Current MyISAM index space = 1646 M
Current key_buffer_size = 476 M
Key cache miss rate is 1 / 56
Key buffer fill ratio = 90.00 %
You could increase key_buffer_size
It is safe to raise this up to 1/4 of total system memory;
assuming this is a dedicated database server.

QUERY CACHE
Query cache is enabled
Current query_cache_size = 64 M
Current query_cache_used = 38 M
Current Query cache fill ratio = 59.90 %

SORT OPERATIONS
Current sort_buffer_size = 2 M
Current record/read_rnd_buffer_size = 256.00 K
Sort buffer seems to be fine

JOINS
Current join_buffer_size = 128.00 K
You have had 111560 queries where a join could not use an index properly
You have had 91 joins without keys that check for key usage after each row
You should enable “log-queries-not-using-indexes”
Then look for non indexed joins in the slow query log.
If you are unable to optimize your queries you may want to increase your
join_buffer_size to accommodate larger joins in one pass.

TABLE CACHE
Current table_cache value = 3600 tables
You have a total of 798 tables
You have 1904 open tables.
The table_cache value seems to be fine

TEMP TABLES
Current tmp_table_size = 128 M
1% of tmp tables created were disk based
Created disk tmp tables ratio seems fine

TABLE SCANS
Current read_buffer_size = 128.00 K
Current table scan ratio = 797 : 1
read_buffer_size seems to be fine

TABLE LOCKING
Current Lock Wait ratio = 1 : 1782
You may benefit from selective use of InnoDB.

As seen from script output, there are certain variables which might be increased a bit for better SQL performance, one such variable as suggested is key_buffer_size(You could increase key_buffer_size)

Now the steps to make the tunings to my.cnf are precisely the same as with mysqltuner.pl, e.g.:
1. Preserve old config variables which will be changed by commenting them
2. Double value of current variables in my.cnf suggested by script
3. Restart Mysql server via /etc/init.d/mysql restart cmd.
4. If mysql runs fine monitor mysql performance with mtop or mytop for at least 15 mins / half an hour.

if all is fine run once again the tuning scripts to see if there are no further improvement suggestions, if there are more follow the 4 steps described procedure once again.

It’s also a good idea that these scripts are periodically re-run on the server like once per few months as changes in SQL queries amounts and types will require changes in MySQL operational variables.
The authors of these nice scripts has done great job and have saved us a tons of nerves time, downtimes and money spend on meaningless hardware. So big thanks for the awesome scripts guys 😉
Finally after hopefully succesful deployment of changes, enjoy the incresed SQL server performance 😉

How to solve “Incorrect key file for table ‘/tmp/#sql_9315.MYI’; try to repair it” mysql start up error

Saturday, April 28th, 2012

When a server hard disk scape gets filled its common that Apache returns empty (no content) pages…
This just happened in one server I administer. To restore the normal server operation I freed some space by deleting old obsolete backups.
Actually the whole reasons for this mess was an enormous backup files, which on the last monthly backup overfilled the disk empty space.

Though, I freed about 400GB of space on the the root filesystem and on a first glimpse the system had plenty of free hard drive space, still restarting the MySQL server refused to start up properly and spit error:

Incorrect key file for table '/tmp/#sql_9315.MYI'; try to repair it" mysql start up error

Besides that there have been corrupted (crashed) tables, which reported next to above error.
Checking in /tmp/#sql_9315.MYI, I couldn't see any MYI – (MyISAM) format file. A quick google look up revealed that this error is caused by not enough disk space. This was puzzling as I can see both /var and / partitions had plenty of space so this shouldn't be a problem. Also manally creating the file /tmp/#sql_9315.MYI with:

server:~# touch /tmp/#sql_9315.MYI

Didn't help it, though the file created fine. Anyways a bit of a closer examination I've noticed a /tmp filesystem mounted besides with the other file system mounts ????
You can guess my great amazement to find this 1 Megabyte only /tmp filesystem hanging on the server mounted on the server.

I didn't mounted this 1 Megabyte filesystem, so it was either an intruder or some kind of "weird" bug…
I digged in Googling to see, if I can find more on the error and found actually the whole mess with this 1 mb mounted /tmp partition is caused by, just recently introduced Debian init script /etc/init.d/mountoverflowtmp.
It seems this script was introduced in Debian newer releases. mountoverflowtmp is some kind of emergency script, which is triggered in case if the root filesystem/ space gets filled.
The script has only two options:

# /etc/init.d/mountoverflowtmp
Usage: mountoverflowtmp [start|stop]

Once started what it does it remounts the /tmp to be 1 megabyte in size and stops its execution like it never run. Well maybe, the developers had something in mind with introducing this script I will not argue. What I should complain though is the script design is completely broken. Once the script gets "activated" and does its job. This 1MB mount stays like this, even if hard disk space is freed on the root partition – / ….

Hence to cope with this unhandy situation, once I had freed disk space on the root partition for some reason mountoverflowtmp stop option was not working,
So I had to initiate "hard" unmount:

server:~# mount -l /tmp

Also as I had a bunch of crashed tables and to fix them, also issued on each of the broken tables reported on /etc/init.d/mysql start start-up.

server:~# mysql -u root -p
mysql> use Database_Name;
mysql> repair table Table_Name extended;
....

Then to finally solve the stupid Incorrect key file for table '/tmp/#sql_XXYYZZ33444.MYI'; try to repair it error, I had to restart once again the SQL server:

Stopping MySQL database server: mysqld.
Starting MySQL database server: mysqld.
Checking for corrupt, not cleanly closed and upgrade needing tables..
root@server:/etc/init.d#

Tadadadadam!, SQL now loads and works back as before!