Posts Tagged ‘processor’

How to test RAM Memory for errors in Linux / UNIX OS servers. Find broken memory RAM banks

Friday, December 3rd, 2021


 

1. Testing the memory with motherboard integrated tools
 

Memory testing has been an integral part of computers for the last 50 years. Older users may remember that in the early days of computing a memory test was part of the boot initialization. The test delayed the boot by some seconds, and the user could watch the memory being counted up to the installed amount. As the amount of memory in modern computers grew, waiting for this check became an annoyance, so on modern hardware it has been shortened or removed completely.
Thus under some circumstances sysadmins or advanced computer users might need to check the memory manually, especially if memory damage is suspected: for example, a home PC starts crashing with Blue Screens of Death on Windows without reason, or a PC or some old arcane Linux / UNIX server restarts every now and then for no apparent reason. When this happens, it is a good idea to start debugging the hardware issue with a simple memory check.

There are multiple ways to test the installed memory banks on a server, laptop or home PC, both with integrated tools and with external programs.
On servers this is usually easily done from the vendor's ILO, IPMI or IDRAC access interface (usually web based). On laptops and home machines it is possible as well from the BIOS or UEFI (Unified Extensible Firmware Interface) setup interface at system boot.

HP BIOS Setup

An old but gold tip that younger people might not know is the prolonged SHIFT key press: on some machines, holding the key down instructs the machine to initiate a memory test before the computer starts reading what is written in the boot loader.

So before trying anything else from this article, it might be a good idea to simply hold SHIFT for 15-20 seconds after a complete shutdown and power-on from the POWER button.

If this test is not triggered, or it is triggered and reports corrupted memory but you are not sure which exact memory bank is failing and want to know more about which banks and segments are breaking, you might want to do more thorough testing. In the article below I'll try to explain briefly how this can be done.


2. Test the memory using a boot USB Flash Drive / DVD / CD 
 

Say hello to memtest86+. It is a bootable utility, typically started from the GRUB boot loader, that tests physical memory by writing various patterns to it and reading them back. Since memtest86+ runs directly on the hardware, it does not require any operating system support for execution. It is important to mention that memtest86 (PassMark MemTest86) and memtest86+ (An Advanced Memory Diagnostic Tool) are different tools: the first is freeware and the second is FOSS software.

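To use it, all you need is a bootable medium with some version of Linux. If you don't already have one burned and lying somewhere in your closet, you might want to burn one.
For Linux / Mac users this is as easy as downloading a Linux distribution ISO file and writing it with dd. Replace /dev/sdX with the whole target device (not a partition) and double check the name, as everything on it gets overwritten:

# dd if=/path/to/iso of=/dev/sdX bs=80M status=progress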
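
After writing, it can be worth checking that the stick really holds the image. Below is a minimal sketch of such a check, assuming a cmp that supports the -n byte limit option (GNU cmp does). The function name verify_written_image is made up for this example and is not part of any tool mentioned here.

```shell
#!/bin/sh
# Hypothetical helper: compare the first <size of ISO> bytes of the target
# with the ISO itself. Returns 0 when they are identical.
verify_written_image() {
    iso="$1"
    target="$2"
    # Size of the ISO in bytes; the target device is usually larger.
    size=$(wc -c < "$iso" | tr -d ' ')
    cmp -n "$size" "$iso" "$target" >/dev/null 2>&1
}
```

It could be used like: verify_written_image /path/to/iso /dev/sdX && echo "image matches"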


Windows users can burn a Live USB with whatever Linux distro they like, or download the latest version of memtest86+ from https://www.memtest.org/ and write it from the Windows desktop with some program such as UNetbootin.
 

2.1. Run memtest86+ on Ubuntu

Many Linux distributions, such as Ubuntu 20.04, come together with memtest86+, which can be easily invoked from the GRUB / GRUB2 boot loader.
Ubuntu has a separate menu entry for Memtest.


On RPM based distributions such as CentOS, Fedora Linux and Red Hat, things differ.

2.2. memtest86+ on Fedora


Fedora used to have the memtest86+ entry in the GRUB boot selection menu, but for some reason removed it. In the newest Fedora releases as of the time of writing, such as Fedora 35, memtest86+ is preinstalled and available but not visible. To start the memtest memory test tool:

  •   Boot a Fedora installation or Rescue CD / USB. At the prompt, type "memtest86".

boot: memtest86

2.3. memtest86+ on RHEL Linux

The memtest86+ tool is available as an RPM package from Red Hat Network (RHN) as well as a boot option from the Red Hat Enterprise Linux rescue disk.
Nowadays Red Hat Enterprise Linux ships with the tool by default.

On prior (now legacy) Red Hat releases it has to be installed and configured with the 3 commands below. Note that the grub2-mkconfig step applies only to GRUB2 based releases (RHEL 7 and later); older releases such as RHEL 5 use GRUB legacy and don't have this command.

[root@rhel ~]# yum install memtest86+
[root@rhel ~]# memtest-setup
[root@rhel ~]# grub2-mkconfig -o /boot/grub2/grub.cfg


Again, as with CentOS, to boot memtest86+ from the rescue disk you will need to boot your system from CD 1 of the Red Hat Enterprise Linux installation media and type the following at the boot prompt (before the Linux kernel is started):

boot: memtest86

memtest86+ testing 5 memory slots

As you see on the screenshot above, the memory banks are listed as Slots. There are a number of tests to be completed before it can be said for sure that the memory does not have any faulty cells.
The lines

Pass: 0
Errors: 0 

show the number of completed test passes and the number of errors found. If memtest86+ does not find anything, the Errors value should stay at zero while the Pass counter grows.
memtest86+ is also useful for detecting CPU temperature issues. Just recently I tested a PC thinking that some memory had defects, but it turned out the problem was the CPU temperature, which was peaking at 80-82 Celsius.

If you're unfortunate and happen to have some corrupted memory segments, you will get red fields listing the memory addresses found to be corrupted during the read / write test operations:



2.4. Install and use memtest86 and memtest86+ on Debian / Mint Linux

You can install memtest86+ alone or, just for fun, install both tools and play around with them, as both have a .deb package in the Debian repositories configured in /etc/apt/sources.list (in the output below both come from the main section).


root@jeremiah:/home/hipo# apt-cache show memtest86 memtest86+
Package: memtest86
Version: 4.3.7-3
Installed-Size: 302
Maintainer: Yann Dirson <dirson@debian.org>
Architecture: amd64
Depends: debconf (>= 0.5) | debconf-2.0
Recommends: memtest86+
Suggests: hwtools, memtester, kernel-patch-badram, grub2 (>= 1.96+20090523-1) | grub (>= 0.95+cvs20040624), mtools
Description-en: thorough real-mode memory tester
 Memtest86 scans your RAM for errors.
 .
 This tester runs independently of any OS – it is run at computer
 boot-up, so that it can test *all* of your memory.  You may want to
 look at `memtester', which allows testing your memory within Linux,
 but this one won't be able to test your whole RAM.
 .
 It can output a list of bad RAM regions usable by the BadRAM kernel
 patch, so that you can still use you old RAM with one or two bad bits.
 .
 This is the last DFSG-compliant version of this software, upstream
 has opted for a proprietary development model starting with 5.0.  You
 may want to consider using memtest86+, which has been forked from an
 earlier version of memtest86, and provides a different set of
 features.  It is available in the memtest86+ package.
 .
 A convenience script is also provided to make a grub-legacy-based
 floppy or image.

Description-md5: 0ad381a54d59a7d7f012972f613d7759
Homepage: http://www.memtest86.com/
Section: misc
Priority: optional
Filename: pool/main/m/memtest86/memtest86_4.3.7-3_amd64.deb
Size: 45470
MD5sum: 8dd2a4c52910498d711fbf6b5753bca9
SHA256: 09178eca21f8fd562806ccaa759d0261a2d3bb23190aaebc8cd99071d431aeb6

Package: memtest86+
Version: 5.01-3
Installed-Size: 2391
Maintainer: Yann Dirson <dirson@debian.org>
Architecture: amd64
Depends: debconf (>= 0.5) | debconf-2.0
Suggests: hwtools, memtester, kernel-patch-badram, memtest86, grub-pc | grub-legacy, mtools
Description-en: thorough real-mode memory tester
 Memtest86+ scans your RAM for errors.
 .
 This tester runs independently of any OS – it is run at computer
 boot-up, so that it can test *all* of your memory.  You may want to
 look at `memtester', which allows to test your memory within Linux,
 but this one won't be able to test your whole RAM.
 .
 It can output a list of bad RAM regions usable by the BadRAM kernel
 patch, so that you can still use your old RAM with one or two bad bits.
 .
 Memtest86+ is based on memtest86 3.0, and adds support for recent
 hardware, as well as a number of general-purpose improvements,
 including many patches to memtest86 available from various sources.
 .
 Both memtest86 and memtest86+ are being worked on in parallel.
Description-md5: aa685f84801773ef97fdaba8eb26436a
Homepage: http://www.memtest.org/

Tag: admin::benchmarking, admin::boot, hardware::storage:floppy,
 interface::text-mode, role::program, scope::utility, use::checking
Section: misc
Priority: optional
Filename: pool/main/m/memtest86+/memtest86+_5.01-3_amd64.deb
Size: 75142
MD5sum: 4f06523532ddfca0222ba6c55a80c433
SHA256: ad42816e0b17e882713cc6f699b988e73e580e38876cebe975891f5904828005
 

 

root@jeremiah:/home/hipo# apt-get install --yes memtest86+

root@jeremiah:/home/hipo# apt-get install --yes memtest86

Reading package lists… Done
Building dependency tree       
Reading state information… Done
Suggested packages:
  hwtools kernel-patch-badram grub2 | grub
The following NEW packages will be installed:
  memtest86
0 upgraded, 1 newly installed, 0 to remove and 21 not upgraded.
Need to get 45.5 kB of archives.
After this operation, 309 kB of additional disk space will be used.
Get:1 http://ftp.de.debian.org/debian buster/main amd64 memtest86 amd64 4.3.7-3 [45.5 kB]
Fetched 45.5 kB in 0s (181 kB/s)     
Preconfiguring packages …
Selecting previously unselected package memtest86.
(Reading database … 519985 files and directories currently installed.)
Preparing to unpack …/memtest86_4.3.7-3_amd64.deb …
Unpacking memtest86 (4.3.7-3) …
Setting up memtest86 (4.3.7-3) …
Generating grub configuration file …
Found background image: saint-John-of-Rila-grub.jpg
Found linux image: /boot/vmlinuz-4.19.0-18-amd64
Found initrd image: /boot/initrd.img-4.19.0-18-amd64
Found linux image: /boot/vmlinuz-4.19.0-17-amd64
Found initrd image: /boot/initrd.img-4.19.0-17-amd64
Found linux image: /boot/vmlinuz-4.19.0-8-amd64
Found initrd image: /boot/initrd.img-4.19.0-8-amd64
Found linux image: /boot/vmlinuz-4.19.0-6-amd64
Found initrd image: /boot/initrd.img-4.19.0-6-amd64
Found linux image: /boot/vmlinuz-4.19.0-5-amd64
Found initrd image: /boot/initrd.img-4.19.0-5-amd64
Found linux image: /boot/vmlinuz-4.9.0-8-amd64
Found initrd image: /boot/initrd.img-4.9.0-8-amd64
Found memtest86 image: /boot/memtest86.bin
Found memtest86+ image: /boot/memtest86+.bin
Found memtest86+ multiboot image: /boot/memtest86+_multiboot.bin
File descriptor 3 (pipe:[66049]) leaked on lvs invocation. Parent PID 22581: /bin/sh
done
Processing triggers for man-db (2.8.5-2) …

 

After this, both memory testers memtest86+ and memtest86 will appear in the GRUB boot menu, next to the entries for booting the different kernel versions and the Advanced recovery kernels.

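2.5. Use the kernel's embedded memtest on any Linux by adding a kernel parameter

2.5.1. Reboot your computer

# reboot

2.5.2. At the GRUB boot screen (with UEFI, press Esc to show it), select the boot entry and press e to edit it.

2.5.3. For 4 passes, temporarily add the memtest=4 kernel parameter to the end of the line starting with linux, then press Ctrl+X or F10 to boot. This works only if the kernel was built with memtest support (CONFIG_MEMTEST).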
 

memtest=        [KNL,X86,ARM,PPC,RISCV] Enable memtest
                Format: <integer>
                default : 0 <disable>
                Specifies the number of memtest passes to be
                performed. Each pass selects another test
                pattern from a given set of patterns. Memtest
                fills the memory with this pattern, validates
                memory contents and reserves bad memory
                regions that are detected.
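
Once the system is up, you may want to confirm that the parameter really reached the kernel. The sketch below only parses a kernel command line string. The function name memtest_passes is made up for this example, and the suggested follow-up checks assume a Debian-like layout where the kernel config is at /boot/config-$(uname -r).

```shell
#!/bin/sh
# Hypothetical helper: print the number of memtest passes requested on a
# kernel command line, or 0 when the parameter is absent.
memtest_passes() {
    cmdline="$1"
    passes=0
    for word in $cmdline; do
        case "$word" in
            memtest=*) passes="${word#memtest=}" ;;
        esac
    done
    echo "$passes"
}

# Possible follow-up checks on a live system (not run here):
#   memtest_passes "$(cat /proc/cmdline)"
#   grep CONFIG_MEMTEST "/boot/config-$(uname -r)"
#   dmesg | grep -i memtest
```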


3. Install and use memtester Linux tool
 

Sometimes memory is one of the suspicious parts, or you just want to have a quick test. memtester is an effective userspace tester for stress-testing the memory subsystem. It is very effective at finding intermittent and non-deterministic faults.

The advantage of memtester as a "live system check tool" is that you can check your system for errors while it is still running. There is no need for a restart, just run the application. The downside is that some segments of memory cannot be thoroughly tested, because they already hold the data needed to keep the operating system running. Thus, whenever possible, stick to the rule of testing the memory with memtest86+ from the OS boot loader after a clean machine restart, so that the whole memory can be tested.

Anyhow, for a general memory test on a critical legacy server (if, let's say, you don't have access to a remote console board, or don't trust the hardware integrity statistics reported by ILO / IPMI), running memtester from the already booted system is still a good idea.


3.1. Install memtester on any Linux distribution from source

# wget http://pyropus.ca/software/memtester/old-versions/memtester-4.2.2.tar.gz
# tar zxvf memtester-4.2.2.tar.gz
# cd memtester-4.2.2
# make && make install

3.2. Install on RPM based distros

On Fedora memtester is available from the repositories; however, on many other RPM based distros it is not, so you have to install it from source.

[root@fedora ]# yum install -y memtester

 

3.3. Install memtester on Deb based Linux distributions from packages
 

To install it on Debian / Ubuntu / Mint etc. , open a terminal and type:
 

root@linux:/ #  apt install --yes memtester

The general run syntax is:

memtester [-p PHYSADDR] MEMORY[B|K|M|G] [ITERATIONS]


You can hence use it like so:

hipo@linux:/ $ sudo memtester 1024 5

This should allocate 1024MB of memory and repeat the test 5 times. The more repeats you run the better, but as a memtester run places a great overall load on the system, you should either not increase the number of runs too much or at least run it with a lowered process priority, e.g. by starting it with nice:

hipo@linux:/ $ nice -n 15 sudo memtester 1024 5

 

  • If you have more RAM, like 4GB or 8GB, it is up to you how much memory you want to allocate for testing.
  • As your operating system and the currently running processes take some amount of RAM, please check the available free RAM and assign at most that much to memtester.
  • If you are using a 32 bit system, a single memtester process can't test more than 4 GB even if you have more RAM, because of the 32 bit address space limit (in practice the usable amount is lower).
  • If your system is very busy and you still assign more than the available amount of RAM, the test might drive your system into a deadlock and lead to a system halt, so be aware of this.
  • Run memtester as the root user, so that the memtester process can malloc the memory and, once it gets hold of that memory, lock it with mlock. If the specified amount of memory is not available, it will try to reduce the requested RAM automatically and lock that instead.
  • If you run it as a regular user, it usually can't lock the memory and can't auto reduce the requested amount, so it tries to get hold of the whole specified memory and starts exhausting system resources.
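
To avoid guessing the size, the amount can be derived from what the kernel reports as available. This is only a sketch, assuming a kernel that exposes MemAvailable in /proc/meminfo (3.14 or newer). The function name memtester_size_mb and the 512 MB default safety margin are my own choices, not anything memtester provides.

```shell
#!/bin/sh
# Hypothetical helper: print a test size in MB, leaving a safety margin
# (default 512 MB) below the MemAvailable value of a meminfo-style file.
memtester_size_mb() {
    meminfo="${1:-/proc/meminfo}"
    margin_mb="${2:-512}"
    awk -v margin="$margin_mb" '
        /^MemAvailable:/ {
            mb = int($2 / 1024) - margin   # the field is in kB
            if (mb < 0) mb = 0
            print mb
            found = 1
        }
        END { if (!found) exit 1 }         # no MemAvailable line at all
    ' "$meminfo"
}
```

A possible use would be: sudo memtester "$(memtester_size_mb)" 3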


If you have 8 gigabytes of RAM plugged into the PC motherboard, you have to multiply 1024*8. This is easily done with the bc (An arbitrary precision calculator language) tool:

root@linux:/ # bc -l
bc 1.07.1
Copyright 1991-1994, 1997, 1998, 2000, 2004, 2006, 2008, 2012-2017 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'. 
8*1024
8192


For example, you should run:

root@linux:/ # memtester 8192 5

memtester version 4.3.0 (64-bit)
Copyright (C) 2001-2012 Charles Cazabon.
Licensed under the GNU General Public License version 2 (only).

pagesize is 4096
pagesizemask is 0xfffffffffffff000
want 8192MB (2083520512 bytes)
got  8192MB (2083520512 bytes), trying mlock …Loop 1/1:
  Stuck Address       : ok        
  Random Value        : ok
  Compare XOR         : ok
  Compare SUB         : ok
  Compare MUL         : ok
  Compare DIV         : ok
  Compare OR          : ok
  Compare AND         : ok
  Sequential Increment: ok
  Solid Bits          : ok        
  Block Sequential    : ok        
  Checkerboard        : ok        
  Bit Spread          : ok        
  Bit Flip            : ok        
  Walking Ones        : ok        
  Walking Zeroes      : ok        
  8-bit Writes        : ok
  16-bit Writes       : ok

Done.

 

4. Shell Script to test server memory for corruptions
 

If for some reason the machine you want to run a memory test on has no connection to an external network such as the internet, and therefore you cannot configure a package repository and install memtester, the other approach is to use a simple memory test script such as memtestlinux.sh
 

#!/bin/bash
# Downloaded from https://www.srv24x7.com/memtest-linux/
echo "ByteOnSite Memory Test"
# Count the logical CPUs and pick the number of writer threads
cpus=$(grep -c processor /proc/cpuinfo)
if [ "$cpus" -lt 6 ]; then
    threads=2
else
    threads=$((cpus / 2))
fi
echo "Detected $cpus CPUs, using $threads threads.."
# Total RAM in KB as reported by free
memory=$(free | grep 'Mem:' | awk '{print $2}')
memoryper=$((memory / threads))
echo "Detected ${memory}K of RAM ($memoryper per thread).."
# Free space on the current partition in KB
freespace=$(df -B1024 . | tail -n1 | awk '{print $4}')
if [ "$freespace" -le "$memory" ]; then
    echo "You do not have enough free space on the current partition. Minimum: ${memory}K"
    exit 1
fi
echo "Clearing RAM Cache.."
sync; echo 3 > /proc/sys/vm/drop_caches
echo > dump.memtest.img
echo "Writing to dump file (dump.memtest.img).."
for i in $(seq 1 "$threads"); do
    # 1044 is used in place of 1024 to ensure full RAM usage (2% over allocation)
    dd if=/dev/urandom bs="$memoryper" count=1044 >> dump.memtest.img 2>/dev/null &
    pids[$i]=$!
    echo "$i"
done
# Wait for all background writers to finish
for pid in "${pids[@]}"; do
    wait "$pid"
done

echo "Reading and analyzing dump file..."
echo "Pass 1.."
md51=$(md5sum dump.memtest.img | awk '{print $1}')
echo "Pass 2.."
md52=$(md5sum dump.memtest.img | awk '{print $1}')
echo "Pass 3.."
md53=$(md5sum dump.memtest.img | awk '{print $1}')
# All three checksums must be identical
if [ "$md51" != "$md52" ]; then
    fail=1
elif [ "$md51" != "$md53" ]; then
    fail=1
elif [ "$md52" != "$md53" ]; then
    fail=1
else
    fail=0
fi
if [ "$fail" -eq 0 ]; then
    echo "Memory test PASSED."
else
    echo "Memory test FAILED. Bad memory detected."
fi
rm -f dump.memtest.img
exit "$fail"

Nota Bene !: Again, consider that the results might not always be 100% trustworthy. If possible, restart the server and test with memtest86+.

Also make sure, prior to running the script, that you have enough disk space for the dump.memtest.img file. The file is created as a test bed for the memory tests, and if not sized properly you might end up with a full ( / ) root filesystem!

 

4.1. Other memory test script with dd and md5sum checksum

I found this solution on the well known sysadmin site nixCraft (cyberciti.biz). I think it makes sense and is quicker.

First find out the memory size using the free command.
 

# free
             total       used       free     shared    buffers     cached
Mem:      32867436   32574160     293276          0      16652   31194340
-/+ buffers/cache:    1363168   31504268
Swap:            0          0          0


It shows that this server has 32GB of memory.
 

# dd if=/dev/urandom bs=32867436 count=1050 of=/home/memtest


free reports in kilobytes, and a count of 1050 is used to make sure the memtest file is bigger than the physical memory. To get better performance, use a proper bs size, for example 2048 or 4096, depending on your local disk I/O. The rule is to make bs * count > 32 GB.
Then run:

# md5sum /home/memtest; md5sum /home/memtest; md5sum /home/memtest


If you see an md5sum mismatch between the runs, you very likely have faulty memory.
The theory is simple: reading the file /home/memtest fills up all available memory with cached file data, so with each md5sum run you are reading the same data through memory again.
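
Comparing three long checksums by eye is error prone, so the comparison can be scripted. This is a minimal sketch, assuming md5sum is installed. The function name checksums_agree is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: checksum a file several times and report whether all
# runs agree. Returns 0 when every checksum matches the first one.
checksums_agree() {
    file="$1"
    runs="${2:-3}"
    first=""
    i=0
    while [ "$i" -lt "$runs" ]; do
        sum=$(md5sum "$file" | awk '{print $1}')
        if [ -z "$first" ]; then
            first="$sum"
        elif [ "$sum" != "$first" ]; then
            echo "MISMATCH on run $((i + 1)): $sum != $first"
            return 1
        fi
        i=$((i + 1))
    done
    echo "OK: $runs identical checksums ($first)"
}
```

For the file from the example above that would be: checksums_agree /home/memtest 3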


5. Other ways to test memory / do a machine stress test

Another good tool you might want to check for memory testing is mprime – ftp://mersenne.org/gimps/
(https://www.mersenne.org/ftp_root/gimps/)

  •  (mprime can also be used to stress test your CPU)

Alternatively, use the stress-ng package to run all kinds of stress tests (including memory tests) on your machine.
Perhaps there are other interesting tools for memory diagnosis. If you know others that I've missed, let me know in the comment section.

How to check how many processor and volume groups IBM AIX eServer have

Monday, July 13th, 2020

In my daily sysadmin duties I have usually been administrating GNU / Linux or FreeBSD servers.
However, now I've also been assigned some minor sysadmin activities on a few IBM AIX eServer UNIX machines.

As the eServers were completely unknown to me and I was logging in for the first time, I needed a way to get an idea of what kind of hardware I was logging in to, so I wanted to get information about the Central Processing Units (CPUs) on the host.

On Linux I'm used to doing a cat /proc/cpuinfo or running dmidecode etc. to get the number of CPUs; however, AIX does not have /proc/cpuinfo and has its own way to get information about the system hardware.
As I've read in the IBM AIX Redbook, to get system information on AIX there is the lscfg command.
 

aix:/# lscfg
INSTALLED RESOURCE LIST

The following resources are installed on the machine.
+/- = Added or deleted from Resource List.
*   = Diagnostic support not available.

  Model Architecture: chrp
  Model Implementation: Multiple Processor, PCI bus

+ sys0                                                            System Object
+ sysplanar0                                                      System Planar
* vio0                                                            Virtual I/O Bus
* vscsi3           U8205.E6B.068D6AP-V4-C21-T1                    Virtual SCSI Client Adapter
* vscsi2           U8205.E6B.068D6AP-V4-C20-T1                    Virtual SCSI Client Adapter
* vscsi1           U8205.E6B.068D6AP-V4-C11-T1                    Virtual SCSI Client Adapter
* hdisk1           U8205.E6B.068D6AP-V4-C11-T1-L8100000000000000  Virtual SCSI Disk Drive
* vscsi0           U8205.E6B.068D6AP-V4-C10-T1                    Virtual SCSI Client Adapter
* hdisk0           U8205.E6B.068D6AP-V4-C10-T1-L8100000000000000  Virtual SCSI Disk Drive
* ent3             U8205.E6B.068D6AP-V4-C5-T1                     Virtual I/O Ethernet Adapter (l-lan)
* ent2             U8205.E6B.068D6AP-V4-C4-T1                     Virtual I/O Ethernet Adapter (l-lan)
* ent1             U8205.E6B.068D6AP-V4-C3-T1                     Virtual I/O Ethernet Adapter (l-lan)
* ent0             U8205.E6B.068D6AP-V4-C2-T1                     Virtual I/O Ethernet Adapter (l-lan)
* vsa0             U8205.E6B.068D6AP-V4-C0                        LPAR Virtual Serial Adapter
* vty0             U8205.E6B.068D6AP-V4-C0-L0                     Asynchronous Terminal
+ L2cache0                                                        L2 Cache
+ mem0                                                            Memory
+ proc0                                                           Processor
+ proc4                                                           Processor


To get the number of processors on the host I've had to use:

 

aix:/# lscfg|grep -i proc
  Model Implementation: Multiple Processor, PCI bus
+ proc0                                                           Processor
+ proc4                                                           Processor


Another way to get the CPU number is with:

aix:/# lsdev -C -c processor
proc0 Available 00-00 Processor
proc4 Available 00-04 Processor

 

aix:/# lsattr -EH -l proc4
attribute   value          description           user_settable

frequency   3720000000     Processor Speed       False
smt_enabled true           Processor SMT enabled False
smt_threads 4              Processor SMT threads False
state       enable         Processor state       False
type        PowerPC_POWER7 Processor type        False

aix:/# lsattr -EH -l proc0
attribute   value          description           user_settable

frequency   3720000000     Processor Speed       False
smt_enabled true           Processor SMT enabled False
smt_threads 4              Processor SMT threads False
state       enable         Processor state       False
type        PowerPC_POWER7 Processor type        False


As you can see, the host has 2 processors (proc0 and proc4), each with SMT enabled and 4 threads. To get the overall number of CPUs on the system, including the SMT threads seen as logical CPUs:

aix:/# bindprocessor -q
The available processors are:  0 1 2 3 4 5 6 7


This specific machine has 8 logical CPUs overall (2 processors x 4 SMT threads).
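
If the number is needed in a script, the list can be counted instead of read by eye. A small sketch, assuming the output has the exact "The available processors are:" prefix shown above. The function name count_logical_cpus is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: count the logical CPU numbers listed after the colon
# in "bindprocessor -q" style output read from standard input.
count_logical_cpus() {
    sed -n 's/^The available processors are:[[:space:]]*//p' | wc -w | tr -d ' '
}
```

On the AIX host it would be fed like this: bindprocessor -q | count_logical_cpus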

lscfg can be used to get various other useful info about the iron:

aix:/# lscfg -s
INSTALLED RESOURCE LIST

 

The following resources are installed on the machine.
+/- = Added or deleted from Resource List.
*   = Diagnostic support not available.

  Model Architecture: chrp
  Model Implementation: Multiple Processor, PCI bus

+ sys0
        System Object
+ sysplanar0
        System Planar
* vio0
        Virtual I/O Bus
* vscsi3           U8305…………….
        Virtual SCSI Client Adapter
* vscsi2           U8305…………….
        Virtual SCSI Client Adapter
* vscsi1           U8305…………….
        Virtual SCSI Client Adapter
* hdisk1           U8305…………….
        Virtual SCSI Disk Drive
* vscsi0           U8305……………..
        Virtual SCSI Client Adapter
* hdisk0           U8305…………….
        Virtual SCSI Disk Drive
* ent3             U8305…………….
        Virtual I/O Ethernet Adapter (l-lan)
* ent2             U8305.E6B…………….
        Virtual I/O Ethernet Adapter (l-lan)
* ent1             U8305.E6B…………….
        Virtual I/O Ethernet Adapter (l-lan)
* ent0             U8305.E6B…………….
        Virtual I/O Ethernet Adapter (l-lan)
* vsa0             U8305.E7B…………….
        LPAR Virtual Serial Adapter
* vty0             U8305.E7B…………….
        Asynchronous Terminal
+ L2cache0
        L2 Cache
+ mem0
        Memory
+ proc0
        Processor
+ proc4
        Processor

aix:/# lscfg -p
INSTALLED RESOURCE LIST

The following resources are installed on the machine.

  Model Architecture: chrp
  Model Implementation: Multiple Processor, PCI bus

  sys0                                                            System Object
  sysplanar0                                                      System Planar
  vio0                                                            Virtual I/O Bus
  vscsi3           U8305.E7B…………….V6-C40-T1                    Virtual SCSI Client Adapter
  vscsi2           U8305.E7B…………….V6-C40-T1                     Virtual SCSI Client Adapter
  vscsi1           U8305.E7B…………….V6-C40-T1                    Virtual SCSI Client Adapter
  hdisk1           U8305.E7B…………….V6-C40-T1-L8500000000000000  Virtual SCSI Disk Drive
  vscsi0           U8305.E7B…………….V6-C40-T1                    Virtual SCSI Client Adapter
  hdisk0           U8305.E7B…………….V6-C40-T1-L8500000000000000  Virtual SCSI Disk Drive
  ent3             U8305.E7B…………….V6-C40-T1                     Virtual I/O Ethernet Adapter (l-lan)
  ent2             U8305.E7B…………….V6-C40-T1                     Virtual I/O Ethernet Adapter (l-lan)
  ent1             U8305.E7B…………….V6-C40-T1                     Virtual I/O Ethernet Adapter (l-lan)
  ent0             U8305.E7B…………….V6-C40-T1                     Virtual I/O Ethernet Adapter (l-lan)
  vsa0             U8305.E7B.069D7AP-V5-C1                        LPAR Virtual Serial Adapter
  vty0             U8305.E7B.069D7AP-V5-D1-L0                     Asynchronous Terminal
  L2cache0                                                        L2 Cache
  mem0                                                            Memory
  proc0                                                           Processor
  proc4                                                           Processor

  PLATFORM SPECIFIC

  Name:  IBM,8305-E7B
    Model:  IBM,8305-E7B
    Node:  /
    Device Type:  chrp

  Name:  openprom
    Model:  IBM,AL730_158
    Node:  openprom

  Name:  interrupt-controller
    Model:  IBM, Logical PowerPC-PIC, 00
    Node:  interrupt-controller@0
    Device Type:  PowerPC-External-Interrupt-Presentation

  Name:  vty
    Node:  vty@30000000
    Device Type:  serial
    Physical Location: …………………………………………..

  Name:  l-lan
    Node:  l-lan@30000002
    Device Type:  network
    Physical Location: …………………………………………..

  Name:  l-lan
    Node:  l-lan@30000003
    Device Type:  network
    Physical Location: …………………………………………..

  Name:  l-lan
    Node:  l-lan@30000004
    Device Type:  network
    Physical Location: …………………………………………..

  Name:  l-lan
    Node:  l-lan@30000005
    Device Type:  network
    Physical Location: …………………………………………..

  Name:  v-scsi
    Node:  v-scsi@3000005a
    Device Type:  vscsi
    Physical Location: …………………………………………..

  Name:  v-scsi
    Node:  v-scsi@3000005b
    Device Type:  vscsi
    Physical Location: …………………………………………..

  Name:  v-scsi
    Node:  v-scsi@30000014
    Device Type:  vscsi
    Physical Location: ………………………………..

  Name:  v-scsi
    Node:  v-scsi@30000017
    Device Type:  vscsi
    Physical Location: …………………………………

 


Another useful command I found lists the physical volumes and the volume groups they belong to (the AIX counterpart of Linux's LVM) configured on the system:

aix:/# lspv
hdisk0 00f68c6a84acb0d5 rootvg active
hdisk1 00f69d6a85400468 dsvg active

To get more info on a physical volume:

aix:/# lspv hdisk0
PHYSICAL VOLUME:    hdisk0                   VOLUME GROUP:     rootvg
PV IDENTIFIER:      00f68d6a85acb0d5         VG IDENTIFIER     00f68d6a00004c0000000131353444a5
PV STATE:           active
STALE PARTITIONS:   0                        ALLOCATABLE:      yes
PP SIZE:            32 megabyte(s)           LOGICAL VOLUMES:  12
TOTAL PPs:          959 (30688 megabytes)    VG DESCRIPTORS:   2
FREE PPs:           493 (15776 megabytes)    HOT SPARE:        no
USED PPs:           466 (14912 megabytes)    MAX REQUEST:      256 kilobytes
FREE DISTRIBUTION:  191..00..00..110..192
USED DISTRIBUTION:  01..192..191..82..00
MIRROR POOL:        None


You can see which logical volumes (with their mount points) are placed on which Physical Volume (PV):

aix:/# lspv -l hdisk0
hdisk0:
LV NAME               LPs     PPs     DISTRIBUTION          MOUNT POINT
lg_dumplv             64      64      00..64..00..00..00    N/A
hd8                   1       1       00..00..01..00..00    N/A
hd6                   16      16      00..16..00..00..00    N/A
hd2                   166     166     00..45..89..32..00    /usr
hd4                   29      29      00..11..18..00..00    /
hd3                   40      40      00..04..04..32..00    /tmp
hd9var                55      55      00..00..37..18..00    /var
hd10opt               74      74      00..37..37..00..00    /opt
hd1                   8       8       00..07..01..00..00    /home
hd5                   1       1       01..00..00..00..00    N/A
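
The PPs column becomes more readable when converted to megabytes with the PP SIZE reported by lspv hdisk0 (32 megabytes on this host). A rough sketch, assuming the two header lines shown above. The function name lv_sizes_mb is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: convert the PPs column of "lspv -l" output (stdin)
# into megabytes, given the PP size in MB as the first argument.
lv_sizes_mb() {
    pp_mb="$1"
    awk -v pp="$pp_mb" '
        # Skip the "hdisk0:" line and the column header line.
        NR > 2 && $3 ~ /^[0-9]+$/ { printf "%s %d\n", $1, $3 * pp }
    '
}
```

On the host it would be used as: lspv -l hdisk0 | lv_sizes_mb 32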

Command to get CPU server load in % percentage using bash and /proc/stat on Linux

Wednesday, March 11th, 2015


Getting the load average is easy with the uptime command. However, since nowadays Linux servers run on machines with multiple CPUs and cores, the returned load average is not normalized by the number of processors, so it does not directly show how busy the CPUs are in percent. Of course, seeing the overall CPU server load is possible with top / tload / htop and a bunch of other monitoring commands, but how can you get the CPU load as a percentage using just /proc/stat and bash scripting? Here is how:
 

:;sleep=1;CPU=(`cat /proc/stat | head -n 1`);PREV_TOTAL=0;for VALUE in "${CPU[@]}"; do let "PREV_TOTAL=$PREV_TOTAL+$VALUE";done;PREV_IDLE=${CPU[4]};sleep $sleep; CPU=(`cat /proc/stat | head -n 1`);unset CPU[0];IDLE=${CPU[4]};TOTAL=0; for VALUE in "${CPU[@]}"; do let "TOTAL=$TOTAL+$VALUE"; done;echo $(echo "scale=2; ((($sleep*1000)*(($TOTAL-$PREV_TOTAL)-($IDLE-$PREV_IDLE))/($TOTAL-$PREV_TOTAL))/10)" | bc -l );

52.45

As you can see, the command output shows the CPU was 52.45% loaded during the one second sample. A single reading proves little, but if a server stays above 50% CPU load most of the time, it is worth planning a replacement with better hardware.

It is useful to combine the above bash one liner with a little for loop, to refresh the output every second and see how the CPU load in percent changes over time.

 

for i in $(seq 0 10); do :;sleep=1;CPU=(`cat /proc/stat | head -n 1`);unset CPU[0];PREV_TOTAL=0;for VALUE in "${CPU[@]}"; do let "PREV_TOTAL=$PREV_TOTAL+$VALUE";done;PREV_IDLE=${CPU[4]};sleep $sleep; CPU=(`cat /proc/stat | head -n 1`);unset CPU[0];IDLE=${CPU[4]};TOTAL=0; for VALUE in "${CPU[@]}"; do let "TOTAL=$TOTAL+$VALUE"; done;echo $(echo "scale=2; ((($sleep*1000)*(($TOTAL-$PREV_TOTAL)-($IDLE-$PREV_IDLE))/($TOTAL-$PREV_TOTAL))/10)" | bc -l ); done

47.50

13.86
27.36
82.67
77.18

To monitor the overall load of all server processors "forever" (until interrupted with CTRL+C) use:
 

while [ 1 ]; do :;sleep=1;CPU=(`cat /proc/stat | head -n 1`);unset CPU[0];PREV_TOTAL=0;for VALUE in "${CPU[@]}"; do let "PREV_TOTAL=$PREV_TOTAL+$VALUE";done;PREV_IDLE=${CPU[4]};sleep $sleep; CPU=(`cat /proc/stat | head -n 1`);unset CPU[0];IDLE=${CPU[4]};TOTAL=0; for VALUE in "${CPU[@]}"; do let "TOTAL=$TOTAL+$VALUE"; done;echo $(echo "scale=2; ((($sleep*1000)*(($TOTAL-$PREV_TOTAL)-($IDLE-$PREV_IDLE))/($TOTAL-$PREV_TOTAL))/10)" | bc -l ); done
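
For readability, the same calculation can be unpacked into small bash functions. This is only a sketch under a few assumptions: the function names cpu_busy_percent and sample_cpu_load are made up for illustration, awk is used instead of bc for the final division, and, like the one liner, everything except the idle counter (so also iowait) is counted as busy time.

```bash
#!/usr/bin/env bash
# Sketch: CPU busy percentage from two samples of the first ("cpu ...")
# line of /proc/stat. Function names are hypothetical, not a standard tool.

# cpu_busy_percent PREV_LINE CUR_LINE
# Field 0 is the label "cpu", field 4 is the idle counter.
cpu_busy_percent() {
    local -a prev=($1) cur=($2)
    local prev_total=0 cur_total=0 v
    # Sum all numeric counters, skipping the "cpu" label at index 0.
    for v in "${prev[@]:1}"; do prev_total=$((prev_total + v)); done
    for v in "${cur[@]:1}";  do cur_total=$((cur_total + v)); done
    local d_total=$((cur_total - prev_total))
    local d_idle=$(( ${cur[4]} - ${prev[4]} ))
    # Guard against division by zero when no ticks elapsed between samples.
    if [ "$d_total" -le 0 ]; then
        echo "0.00"
        return 0
    fi
    awk -v t="$d_total" -v i="$d_idle" 'BEGIN { printf "%.2f\n", 100 * (t - i) / t }'
}

# sample_cpu_load [INTERVAL_SECONDS]
# Takes two samples INTERVAL_SECONDS apart (default 1) and prints the load.
sample_cpu_load() {
    local interval=${1:-1} prev cur
    prev=$(head -n 1 /proc/stat)
    sleep "$interval"
    cur=$(head -n 1 /proc/stat)
    cpu_busy_percent "$prev" "$cur"
}
```

After sourcing the file, running sample_cpu_load 2 prints the load averaged over two seconds, and while true; do sample_cpu_load; done gives the "forever" variant.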

 

 

What is VT-x (Intel Virtualization) and AMD V (AMD Virtualization)

Wednesday, June 4th, 2014

As I'm lately educating myself in the field of Virtualization and Virtual Machines, an interesting question popped up: what is Virtualization on a hardware level, and which Intel and AMD technologies support it?

 

  • Intel Virtualization (VT-x)

Is Intel's hardware assistance for processors running virtualization platforms. Intel Virtualization is known as VT-x for short. Intel VT-x extensions are probably the best recognized extensions, adding migration, priority and memory handling capabilities to a wide range of Intel processors.
Intel VT includes a series of extensions for hardware virtualization, adding virtualization support to Intel chipsets, so that specific I/O devices can be assigned to Virtual Machines. Intel Virtualization is better described here.
 

  • AMD-V (AMD virtualization)


Is a set of hardware extensions for the x86 processor architecture. Advanced Micro Devices (AMD) designed the extensions to perform repetitive tasks normally performed by software and improve resource use and virtual machine (VM) performance. Early virtualization efforts relied on software emulation to replace hardware functionality. But software emulation can be a slow and inefficient process. Because many virtualization tasks were handled through software, VM behavior and resource control were often poor, resulting in unacceptable VM performance on the server. AMD Virtualization (AMD-V) technology was first announced in 2004 under the code name Pacifica, for AMD's 64-bit x86 processor designs. By 2006, AMD's Athlon 64 X2 and Athlon 64 FX processors appeared with AMD-V technology, and today the technology is available on Turion 64 X2, second- and third-generation Opteron, Phenom and Phenom II processors. Just like with Intel Virtualization, AMD-V technology enables extra hardware support for assigning specific I/O devices to each virtualized OS. AMD-V Virtualization is described more thoroughly here.
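
On Linux, a quick way to see whether the processor advertises one of these extensions is to look at the CPU flags in /proc/cpuinfo, where VT-x shows up as the vmx flag and AMD-V as the svm flag. The sketch below assumes a Linux system; the function names virt_extension_from_flags and detect_virt_extension are made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch: report which hardware virtualization extension the CPU flags
# advertise. Function names are hypothetical.

# virt_extension_from_flags "FLAGS"
# FLAGS is a space separated list as found on the "flags" line of /proc/cpuinfo.
virt_extension_from_flags() {
    local flags=" $1 "            # pad so whole-word matching works at both ends
    case "$flags" in
        *" vmx "*) echo "VT-x" ;;   # Intel
        *" svm "*) echo "AMD-V" ;;  # AMD
        *)         echo "none" ;;
    esac
}

# detect_virt_extension
# Reads the first "flags" line of /proc/cpuinfo (Linux only).
detect_virt_extension() {
    local flags
    flags=$(grep -m 1 '^flags' /proc/cpuinfo | cut -d: -f2)
    virt_extension_from_flags "$flags"
}
```

Running detect_virt_extension prints VT-x, AMD-V or none. A result of none only means the flag is not listed; it says nothing about why.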

 

Tracking I/O hard disk server bottlenecks with iostat on GNU / Linux and FreeBSD

Tuesday, March 27th, 2012

Hard disk overhead tracking on Linux and FreeBSD with iostat

I've earlier written an article, How to find which processes are causing hard disk i/o overhead on Linux, where I explained very roughly a few tools which can be used to benchmark hard disk read / write operations. My prior article's accent was on iotop and dstat and it only mentioned iostat. Therefore I've written this short article in an attempt to explain a bit more thoroughly how iostat can be used to track problems with excessive server I/O reads / writes.

Here is the command's man page description:
iostat - Report Central Processing Unit (CPU) statistics and input/output statistics for devices, partitions and network filesystems

Next I will proceed with a few words on how iostat can be installed on various Linux distros, then point at a few of the most common usage scenarios, with a short explanation of the meaning of each of the command outputs.

1. Installing iostat on Linux

iostat is a Swiss army knife for finding a server's hard disk bottlenecks. Though it is a must-have tool in the admin's outfit, most Linux distributions will not have iostat installed by default.
To have it on your server, you will need to install the sysstat package:

a) On Debian / Ubuntu and other Debian GNU / Linux derivatives to install sysstat:

debian:~# apt-get --yes install sysstat

b) On Fedora, CentOS, RHEL etc. install it with yum:

[root@centos ~]# yum -y install sysstat

c) On Slackware Linux the sysstat package, which contains iostat, is installed by default.

d) In FreeBSD, there is no need to install any external package, as iostat is part of the BSD world (bundled commands).
I should mention that BSD's iostat and Linux's iostat commands are not the same, and hence their use to track down hard disk bottlenecks differs a bit. However, the general logic of use is very similar, as with most tools in BSD and Linux.

2. Checking a server hard disk for i/o disk bottlenecks on GNU / Linux

Once sysstat is installed on GNU / Linux systems, the iostat command will be available as /usr/bin/iostat

a) To check the hard disk reads / writes per second (in megabytes) use:

debian:~# /usr/bin/iostat -m
Linux 2.6.32-5-amd64 (debian)   03/27/2012      _x86_64_        (8 CPU)

avg-cpu:   %user   %nice %system %iowait  %steal   %idle
           15.34    0.36    2.76    2.66    0.00   78.88

Device:        tps  MB_read/s  MB_wrtn/s   MB_read    MB_wrtn
sda          63.89       0.48       8.20   6730223  115541235
sdb          64.12       0.44       8.23   6244683  116039483
md0        2118.70       0.22       8.19   3041643  115528074

In the above output, the server where I issued the command uses sda and sdb configured in a software RAID 1 array, visible in the output as md0.

The output of iostat should be easy to read; for anyone who hasn't used the tool, here is a short explanation of the columns:

%user – percentage of CPU utilization that occurred while executing at the user level (applications); here 15.34% of the available CPU time.
%nice – percentage of CPU utilization that occurred while executing at the user level with nice priority.
%system – percentage of CPU utilization that occurred while executing at the system (kernel) level.
%iowait – percentage of time the CPUs were idle while the system had an outstanding disk I/O request (persistently high values are a sign of an i/o problem).
%steal – percentage of time a virtual CPU spent in involuntary wait while the hypervisor was servicing another virtual processor.
%idle – percentage of time the CPUs were idle and the system had no outstanding disk I/O request.
tps – transfers (I/O requests) per second issued to the device
MB_read/s (column) – amount of data read from the device in megabytes per second (in the first report this is an average since boot)
MB_wrtn/s – amount of data written to the device in megabytes per second
MB_read – total number of megabytes read from the device since server boot until the moment iostat was invoked
MB_wrtn – total number of megabytes written to the device since the last server boot

The reason the read / write values for sda and sdb are similar in this example output is that my disks are configured in software RAID 1 (mirror).

The above iostat output reveals that in my specific case the server is doing mostly disk writes (observable in the high MB_wrtn/s value of 8.19 for md0 in the above sample output).

It also shows that most of the CPU time in use goes to user-level load (%user 15.34), while md0 serves about 2118.70 transfers per second.

For all those not familiar with user-level load: this is any load generated by programs running on the server, as opposed to load generated by the Linux kernel or loaded kernel modules, which is counted under %system.
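
To pick out the busiest writer without reading the table by eye, the device rows can be fed through awk. The following is a minimal sketch, assuming the iostat -m layout shown above where MB_wrtn/s is the fourth column; the function name busiest_writer is made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch: print the device with the highest MB_wrtn/s from `iostat -m` text
# read on stdin. Assumes MB_wrtn/s is the 4th column, as in the sample above.
busiest_writer() {
    awk '
        $1 == "Device:" || $1 == "Device" { in_devices = 1; next }  # header row opens the table
        in_devices && NF == 0 { in_devices = 0 }                    # blank line closes it
        in_devices && NF >= 4 {
            if (best == "" || $4 + 0 > max) { max = $4 + 0; best = $1 }
        }
        END { if (best != "") print best, max }
    '
}
```

Usage is iostat -m | busiest_writer. If the input contains several reports (interval mode), the highest value seen across all of them is printed.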

b) To periodically keep an eye on HDD i/o operations with iostat, there are two ways:

– Use watch in conjunction with iostat;

[root@centos ~]# watch "/usr/bin/iostat -m"
Every 2.0s: iostat -m                                   Tue Mar 27 11:00:30 2012

Linux 2.6.32-5-amd64 (centos)   03/27/2012      _x86_64_        (8 CPU)

avg-cpu:   %user   %nice %system %iowait  %steal   %idle
           15.34    0.36    2.76    2.66    0.00   78.88

Device:        tps  MB_read/s  MB_wrtn/s   MB_read    MB_wrtn
sda          63.89       0.48       8.20   6730255  115574152
sdb          64.12       0.44       8.23   6244718  116072400
md0        2118.94       0.22       8.20   3041710  115560990

Device:        tps  MB_read/s  MB_wrtn/s   MB_read    MB_wrtn
sda          55.00       0.01      25.75         0         51
sdb          52.50       0.00      24.75         0         49
md0       34661.00       0.01     135.38         0        270

Even though using watch and running iostat with an interval, as in iostat -m -d 2 (-d selects the device report, 2 is the interval in seconds), might look identical, they are not. watch refreshes the screen every 2 seconds, much like running the clear command before each run, so the output looks like the top command's refresh. Passing -d 2 makes iostat print a new device report every 2 seconds, one after another, so all the data stays visible on the screen. Note also that watch re-runs iostat each time, so it always shows averages since boot, whereas in interval mode every report after the first covers only the last interval. Hence -d 2 is better where more thorough debugging is necessary, while watch + iostat is great for a quick routine view.

c) Outputting extra information for HDD input/output operations;

root@debian:~# iostat -x
Linux 2.6.32-5-amd64 (debian)   03/27/2012      _x86_64_        (8 CPU)

avg-cpu:   %user   %nice %system %iowait  %steal   %idle
           15.34    0.36    2.76    2.66    0.00   78.88

Device:  rrqm/s   wrqm/s    r/s      w/s   rsec/s    wsec/s avgrq-sz avgqu-sz  await  svctm  %util
sda        4.22  2047.33  12.01    51.88   977.44  16785.96   278.03     0.28   4.35   3.87  24.72
sdb        3.80  2047.61  11.97    52.15   906.93  16858.32   277.05     0.03   5.25   3.87  24.84
md0        0.00     0.00  20.72  2098.28   441.75  16784.05     8.13     0.00   0.00   0.00   0.00

This command will output extended useful hard disk info like:
r/s – number of read requests issued per second
w/s – number of write requests issued per second
rsec/s – number of sectors read per second
wsec/s – number of sectors written per second
etc. etc.

Most people will never need to use this, but it is good to know it exists.
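
One practical use of the extended report is flagging devices whose %util is above some limit. Here is a small sketch, assuming the header row starts with "Device" and contains a %util column as in the sample above; the function name busy_devices and the default limit of 80 are arbitrary choices for illustration.

```bash
#!/usr/bin/env bash
# Sketch: list devices from `iostat -x` text (stdin) whose %util is at or
# above a limit. The %util column is located from the header row, so the
# exact column order does not matter.
busy_devices() {
    local threshold=${1:-80}
    awk -v limit="$threshold" '
        $1 ~ /^Device/ {                         # header row: find the %util column
            for (i = 1; i <= NF; i++) if ($i == "%util") col = i
            next
        }
        NF == 0 { col = 0 }                      # blank line ends the device table
        col && NF >= col && $col + 0 >= limit { print $1, $col }
    '
}
```

For example iostat -x | busy_devices 20 prints each device at or above 20% utilization together with its %util value.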

3. Tracking read / write (i/o) hard disk bottlenecks on FreeBSD

BSD's iostat is a bit different in terms of output and arguments.

a) Here is the most basic use:

freebsd# /usr/sbin/iostat
tty ad0 cpu
tin tout KB/t tps MB/s us ni sy in id
1 561 45.18 44 1.95 14 0 5 0 82

b) Periodic watch of hdd i/o operations;

freebsd# iostat -c 10
tty ad0 cpu
tin tout KB/t tps MB/s us ni sy in id
1 562 45.19 44 1.95 14 0 5 0 82
0 307 51.96 113 5.73 44 0 24 0 32
0 234 58.12 98 5.56 16 0 7 0 77
0 43 0.00 0 0.00 1 0 0 0 99
0 485 0.00 0 0.00 2 0 0 0 98
0 43 0.00 0 0.00 0 0 1 0 99
0 43 0.00 0 0.00 0 0 0 0 100
...

As you can see in the output, some of the columns, like tty, tin and tout, are a bit hard to comprehend.
Thankfully the tool has an option to print out only the more essential i/o information:

freebsd# iostat -d -c 10
ad0
KB/t tps MB/s
45.19 44 1.95
58.12 97 5.52
54.81 108 5.78
0.00 0 0.00
0.00 0 0.00
0.00 0 0.00
20.48 25 0.50

The output info is quite self-explanatory.

Displaying a number of iostat readings can also be achieved without the -c option, by passing the wait interval and the count as trailing arguments:

freebsd# iostat -d 1 10
...

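Tracking a specific hard disk with iostat is done by passing the device name as an argument (for example iostat ad0). The -n option used in the run below takes the number of devices to display rather than a device name, which is why this sample prints no disk columns: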

freebsd# iostat -n /dev/ad0s1a
tty cpu
tin tout us ni sy in id
1 577 14 0 5 0 81

c) Getting hard disk read/write information with gstat

gstat is a FreeBSD tool to print statistics for GEOM disks. Its default behaviour is to refresh the screen in a similar fashion to the top command, so it's great for people who would like to periodically check all attached system hard disks and storage devices:

freebsd# gstat
dT: 1.002s w: 1.000s
L(q) ops/s r/s kBps ms/r w/s kBps ms/w %busy Name
0 10 0 0 0.0 10 260 2.6 15.6| ad0
0 10 0 0 0.0 10 260 2.6 11.4| ad0s1
0 10 0 0 0.0 10 260 2.8 12.5| ad0s1a
0 0 0 0 0.0 0 0 0.0 20.0| ad0s1b
0 0 0 0 0.0 0 0 0.0 0.0| ad0s1c
0 0 0 0 0.0 0 0 0.0 0.0| ad0s1d
0 0 0 0 0.0 0 0 0.0 0.0| ad0s1e
0 0 0 0 0.0 0 0 0.0 0.0| acd0

It even has colors if your tty supports colors 🙂

Another useful tool for debugging the culprit of excessive hdd I/O operations is the procstat command.

Here is a sample procstat run to inspect one of my processes (httpd) imposing i/o hdd load:

freebsd# procstat -f 50404
PID COMM FD T V FLAGS REF OFFSET PRO NAME
50404 httpd cwd v d -------- - - - /
50404 httpd root v d -------- - - - /
50404 httpd 0 v c r------- 56 0 - -
50404 httpd 1 v c -w------ 56 0 - -
50404 httpd 2 v r -wa----- 56 75581 - /var/log/httpd-error.log
50404 httpd 3 s - rw------ 105 0 TCP ::.80 ::.0
50404 httpd 4 p - rw---n-- 56 0 - -
50404 httpd 5 p - rw------ 56 0 - -
50404 httpd 6 v r -wa----- 56 25161132 - /var/log/httpd-access.log
50404 httpd 7 v r rw------ 56 0 - /tmp/apr8QUOUW
50404 httpd 8 v r -w------ 56 0 - /var/run/accept.lock.49588
50404 httpd 9 v r -w------ 1 0 - /var/run/accept.lock.49588
50404 httpd 10 v r -w------ 1 0 - /tmp/apr8QUOUW
50404 httpd 11 ? - -------- 2 0 - -
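
The procstat -f listing above can be narrowed down to the files a process has open for writing, which are the usual suspects for write load. This is only a sketch, assuming the column layout shown above (FLAGS in the sixth column, NAME last); the function name written_files is made up for illustration.

```bash
#!/usr/bin/env bash
# Sketch: from `procstat -f PID` text on stdin, print the unique file paths
# opened with the write flag. Assumes FLAGS is column 6 and NAME is the
# last column, as in the sample output above.
written_files() {
    awk '$1 != "PID" && $6 ~ /w/ && $NF ~ /^\// { print $NF }' | LC_ALL=C sort -u
}
```

Usage would be procstat -f 50404 | written_files. Sockets and pipes are skipped because their NAME does not start with a slash.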

By the way, fstat is sometimes helpful in identifying the number of open files and estimating which ones are causing the hdd load.
Hope this info helps someone. If you know better ways to track excessive hdd load on Linux / BSD, please share them.
 

30 years anniversary of the first mass produced portable computer GRiD Compass 1101

Thursday, July 19th, 2012

Grid Notebook Big screen logo

Today the modern laptop (portable computer) is considered to be turning 30 years old. The notebook's grandparent is the GRiD Compass 1101 from GRiD Systems Corporation – a “mobile computer” with an electroluminescent display (ELD) screen supporting a resolution of 320×240 pixels. The screen allowed the user to use the computer console in a text resolution of 80×24 chars. This portable high-tech gadget was equipped with a magnesium alloy case, an Intel 8086 CPU at 8 MHz (like my old desktop Pravetz PC 😉 ), 340 kilobytes of internal non-removable magnetic bubble memory and even a 1,200 bit/s modem!

GRiD Compass, considered the first laptop / notebook on Earth, 30th anniversary of the portable computer

The machine was uniquely compatible for its time, as one could easily attach devices such as 5.25 inch floppy drives and an external (10 MB) hard disk via the IEEE-488 I/O bus, also called GPIB (General Purpose Interface Bus).

First mass produced portable computer laptop GRiD Compass 1101 back side input peripherals

The laptop also had a uniquely small weight of only 5 kg and rechargeable batteries with a power unit (like modern laptops) connectable to a normal (110/220 V) room plug.

First notebook in the world ever, the GRiD Compass 1101
The machine was bundled with its own specifically written OS, GRiD-OS. GRiD-OS could only run specialized software, so the range of available applications was a bit limited.
Shortly after market introduction, because of the incompatibility of GRiD-OS, the GRiD was shipped with MS-DOS v. 2.0.
This primitive laptop computer was developed to serve mainly the needs of business users and of government and military customers (NASA, U.S. military etc.).

GRiD was even used on Space Shuttles during the 1980s and 1990s.
The price of the machine in April 1982, when the GRiD Compass was introduced, was a shockingly high $8150.

The machine hardware design is quite elegant as you can see on below pic:

GRiD Compass 1101 laptop internal bubble memory

As a computer history geek, I’ve researched further on the GRiD Compass and found a nice 1:30 hour video presentation retelling its history in detail.

Shortly after the GRiD Compass 1101’s introduction, many other companies started producing similar sized computers; one example of this was the Epson HX-20 notebook. 30 years later, probably around 70% of the citizens on the globe own a laptop or some kind of portable computer device (smartphone, tablet, ultra-book etc.).

Most computer users owning a desktop nowadays own a laptop too, for mobility reasons. Interestingly, even 30 years later the laptop as we know it is still in a shape (form) very similar to its original predecessor. Today notebook sales are starting to be overshadowed by tablets and ultra-books (in the second quarter laptop sales rose 5%, but compared with 2011 the rise is smaller, 1.8%, according to data provided by the Digital Research agency). There are estimations by Forrester Research pointing out that by the end of 2015, sales of notebook substitute portable devices will exceed the overall sales of notebooks. It is evident today that the market dynamics are changing in favour of tablets and the so called next generation laptops, ULTRA-BOOKS. It is a mass hype and a marketing lie that Ultra-Books are somehow different from laptops. The difference between a classical laptop and an Ultra-Book is the thinner size, lower weight and often longer battery life. Actually Ultra-Books are copying the design concept of the Apple MacBook Air, trying to resell it under a loud name.
Even if in future iPads, Android tablets, Ultra-Books or whatever kind of mumbo-jumbo portable devices flood the market, laptops will still be heavily used by programmers, office workers, company employees and any person who needs to do a lot of regular text editing, email and work with corporate apps. Hence we will likely see notebooks in the likeness of the GRiD Compass 1101 remain dominant until the end of the decade.