Posts Tagged ‘general statistics’

Create SVN (Subversion) web statistics on Debian Lenny Linux with mpy-svn-stats and svnstats.sh

Monday, July 5th, 2010

I’ve recently desired to have a visualized statistics on how many commits, imports, people who commit into subversion’s repositories,graphs showing up the most active comitters, commits into the all subversion repositories grouped by month, week etc.
This kind of valuable information can give you insight, on a projects code life cycle. It can also help you to find out who takes most active participation into a certain project code development etc. and therefore could play a vital role in finding out about the work efficiency within your IT company or IT department.

There are plenty of softwares that can generate you some shiny statistics on how often and by whom are commits into your repositories as well as general statistics concerning your repositories accessibility.

Some of the projects suggested by most Linux users online, who had to resolve the same task e.g. (find some decent software to generate them good statistics on the svn use.) are:

1. statsvn

Here is a description on what statsvn is directly taken from its website:

StatSVN retrieves information from a Subversion repository and generates various tables and charts describing the project development, e.g.

StatSVN looks really promising, however what I find personally repulsive about it is that it depends on a Sun Java virtual machine
I have a bad taste for third party software that depends on java and therefore the software uses an XML dump generated from svn log –xml -v path/to/repos > svn-logfile.xml after which it’s necessary to pass the generated svn-logfile.xml file to statsvn, for instance:

statsvn [options] svn-logfile.xml path/to/repos

though a debian of statsvn is available and packaged for Debian in /usr/share/doc/statsvn/README.Debian we read:

Notes to Debian users:

* the jtreemap-based report has been disabled as jtreemap is currently
not packaged for Debian, and Debian cannot ship the applet without
its sources (not included in statsvn’s sources).

— Vincent Fourmond <fourmond@debian.org>, Tue, 4 Mar 2008 21:14:14 +0100

What I understood from statsvn documentation is that jtreemap is absolutely necessary in order to have a running statsvn, regardless if you have or you don’t have a java vm installed.

To take a general idea on what kind of Repo Roadmap does svnstat generates with jtreemap check out the following link

Since jtreemap is not available prepackaged for Debian I decided not to use svnstats though it looked quite superb.

Some further research on the kind of softwares available online able to generate me some statistics from cvs or subversion source repositories led me to,

2. svnplot

svnplot stroke me with it’s perfect looking many graphics generated on the Lines of Code commited, contribution of different authors to the repository, File count, avarage commit file sizes, common activity, author commit activity etc. etc.

I think tt’s worthy to check out some example statistics about a sample repository statistics generated by svnplot to get a better idea what to expect if you choose to install it on your server.

Even though svnplot looked also promising It wasn’t actually my choice because I think it’s not really mature enough as a software, the second reason which hold me back from installing it on my debian server was that I find it too much as a work in progress still.

Since neither svnstast nor svnplot didn’t well match my expectation and lacked a debian package I finally choose:

3. mpy svn stats as a solution to generate and graph information about svn usage

There are few reasons I finally took svn-mpy-stats to be the solution of choice.

1. It is available as a package in Debian Linux and easily installable via apt-get
2. It is written in Python and therefore doesn’t require a java virtual machine or some extra cosmetics to make it work3. It’s really simple and straight forward to configure and already tested and reported that it works well in Debian GNU/Linux

So here is the few simple steps I took to install mpy-svn-stats on Debian Lenny (in Debian Sid / Squeeze I suppose the procedure would be analogous.

– Install mpy-svn-stats via apt-get or aptitude

debian-server:~# apt-get install mpy-svn-stats

Run it for your svn repository with a command like:

debian-server:~# mkdir /var/www/svnstats
/usr/bin/mpy-svn-stats -o /var/www/svnstats/ file:///var/svn-repos/repository_name

In the above command substitute /var/www/svnstats/ and /var/svn-repos/repository_name with a directory of choice where you like to generate the html statistics for the svn usage as well as the full path and name of your repository.

Of course it’s a good idea to make mpy-svn-stats run periodically with for instance crontab or at or any other unix task cheduler available for your Linux system.

So far so good. You have probably already noticed that it’s rather inconvenient because you have to execute mpy-svn-stats command to each of your svn repositories individually.
This is absolute madness if your company is creating new svn source repository projects often, like let’s say everyday, because you will have to generate statistics for each of the repositories either manually or add new repositories manually to a script which will be later invoked by a crontab rule.

To get around this constrain, I’ve come up with a tiny shell script svnstats.sh which takes care for everything on it’s own.

It automatically will loop in your main subversion repositories directory through all the sub-repositories and generate individual html statistics in a separate automatically created directory by the script.

So to make your life easier and automate the process of generating stats with mpy-svn-stats consider downloading svnstats.sh and installing it as a separate rule like so:

debian-server:~# crontab -u root -e

Include therein the following:

# generate svn statistics everyday in 05:20 a.m.20 5 * * * /usr/sbin/svnstats.sh >/dev/null >>&1

Now everyday at 05:20 your mpy-svn-stats will generate a nice graphs and statistics for your subversion repository usage in /var/www/svstats, if you consider generating the data into a different location consider editting the head of mpy-svn-stats svnstats.sh script and change according to your likings.

Now let’s create an Alias in Apache to enable the (mpy-svn-stats generated by svnstats.sh) to be visualized via web:

– Edit VirtualHost configuration file of choice and put there, something like:

Alias /svnstats/ /var/www/svnstats/

Lastly it might be a good idea to use htaccess to protect your url with a password, afterwards you can enjoy your mpy svn statistics.

Monitoring MySQL server queries and debunning performance (slow query) issues with native MySQL commands and with mtop, mytop

Thursday, May 10th, 2012

If you're a Linux server administrator running MySQL server, you need to troubleshoot performance and bottleneck issues with the SQL database every now and then. In this article, I will pinpoint few methods to debug basic issues with MySQL database servers.

1. Troubleshooting MySQL database queries with native SQL commands

a)One way to debug errors and get general statistics is by logging in with mysql cli and check the mysql server status:

# mysql -u root -p
mysql> SHOW STATUS;
+-----------------------------------+------------+
| Variable_name | Value |
+-----------------------------------+------------+
| Aborted_clients | 1132 |
| Aborted_connects | 58 |
| Binlog_cache_disk_use | 185 |
| Binlog_cache_use | 2542 |
| Bytes_received | 115 |
.....
.....
| Com_xa_start | 0 |
| Compression | OFF |
| Connections | 150000 |
| Created_tmp_disk_tables | 0 |
| Created_tmp_files | 221 |
| Created_tmp_tables | 1 |
| Delayed_errors | 0 |
| Delayed_insert_threads | 0 |
| Delayed_writes | 0 |
| Flush_commands | 1 |
.....
.....
| Handler_write | 132 |
| Innodb_page_size | 16384 |
| Innodb_pages_created | 6204 |
| Innodb_pages_read | 8859 |
| Innodb_pages_written | 21931 |
.....
.....
| Slave_running | OFF |
| Slow_launch_threads | 0 |
| Slow_queries | 0 |
| Sort_merge_passes | 0 |
| Sort_range | 0 |
| Sort_rows | 0 |
| Sort_scan | 0 |
| Table_locks_immediate | 4065218 |
| Table_locks_waited | 196 |
| Tc_log_max_pages_used | 0 |
| Tc_log_page_size | 0 |
| Tc_log_page_waits | 0 |
| Threads_cached | 51 |
| Threads_connected | 1 |
| Threads_created | 52 |
| Threads_running | 1 |
| Uptime | 334856 |
+-----------------------------------+------------+
225 rows in set (0.00 sec)

SHOW STATUS; command gives plenty of useful info, however it is not showing the exact list of queries currently processed by the SQL server. Therefore sometimes it is exactly a stucked (slow queries) execution, you need to debug in order to fix a lagging SQL. One way to track this slow queries is via enabling mysql slow-query.log. Anyways enabling the slow-query requires a MySQL server restart and some critical productive database servers are not so easy to restart and the SQL slow queries have to be tracked "on the fly" so to say.
Therefore, to check the exact (slow) queries processed by the SQL server (without restarting it), do
 

mysql> SHOW processlist;
+——+——+—————+——+———+——+————–+——————————————————————————————————+
| Id | User | Host | db | Command | Time | State | Info |
+——+——+—————+——+———+——+————–+——————————————————————————————————+
| 609 | root | localhost | blog | Sleep | 5 | | NULL |
| 1258 | root | localhost | NULL | Sleep | 85 | | NULL |
| 1308 | root | localhost | NULL | Query | 0 | NULL | show processlist |
| 1310 | blog | pcfreak:64033 | blog | Query | 0 | Sending data | SELECT comment_author, comment_author_url, comment_content, comment_post_ID, comment_ID, comment_aut |
+——+——+—————+——+———+——+————–+——————————————————————————————————+
4 rows in set (0.00 sec)
mysql>

SHOW processlist gives a good view on what is happening inside the SQL.

To get more complete information on SQL query threads use the full extra option:

mysql> SHOW full processlist;

This gives pretty full info on running threads, but unfortunately it is annoying to re-run the command again and again – constantly to press UP Arrow + Enter keys.

Hence it is useful to get the same command output, refresh periodically every few seconds. This is possible by running it through the watch command:

debian:~# watch "'show processlist' | mysql -u root -p'secret_password'"

watch will run SHOW processlist every 2 secs (this is default watch refresh time, for other timing use watch -n 1, watch -n 10 etc. etc.

The produced output will be similar to:

Every 2.0s: echo 'show processlist' | mysql -u root -p'secret_password' Thu May 10 17:24:19 2012

Id User Host db Command Time State Info
609 root localhost blog Sleep 3 NULL1258 root localhost NULL Sleep 649 NULL1542 blog pcfreak:64981 blog Query 0 Copying to tmp table \
SELECT p.ID, p.post_title, p.post_content,p.post_excerpt, p.pos
t_date, p.comment_count, count(t_r.o
1543 root localhost NULL Query 0 NULL show processlist

Though this "hack" is one of the possible ways to get some interactivity on what is happening inside SQL server databases and tables table. for administering hundred or thousand SQL servers running dozens of queries per second – monitor their behaviour few times aday using mytop or mtop is times easier.

Though, the names of the two tools are quite similar and I used to think both tools are one and the same, actually they're not but both are suitable for monitoring sql database execution in real time.

As a sys admin, I've used mytop and mtop, on almost each Linux server with MySQL server installed.
Both tools has helped me many times in debugging oddities with sql servers. Therefore my personal view is mytop and mtop should be along with the Linux sysadmin most useful command tools outfit, still I'm sure many administrators still haven't heard about this nice goodies.

1. Installing mytop on Debian, Ubuntu and other deb based GNU / Linux-es

mytop is available for easy install on Debian and across all debian / ubuntu and deb derivative distributions via apt.

Here is info obtained with apt-cache show

debian:~# apt-cache show mytop|grep -i description -A 3
Description: top like query monitor for MySQL
Mytop is a console-based tool for monitoring queries and the performance
of MySQL. It supports version 3.22.x, 3.23.x, 4.x and 5.x servers.
It's written in Perl and support connections using TCP/IP and UNIX sockets.

Installing the tool is done with the trivial:

debian:~# apt-get --yes install mytop
....

mtop used to be available for apt-get-ting in Debian Lenny and prior Debian releases but in Squeeze onwards, only mytop is included (probably due to some licensing incompitabilities with mtop??).

For those curious on how mtop / mytop works – both are perl scripts written to periodically connects to the SQL server and run commands similar to SHOW FULL PROCESSLIST;. Then, the output is parsed and displayed to the user.

Here how mytop running, looks like:

MyTOP showing queries running on Ubuntu 8.04 Linux - Debugging interactively top like MySQL

2. Installing mytop on RHEL and CentOS

By default in RHEL and CentOS and probably other RedHat based Linux-es, there is neither mtop nor mytop available in package repositories. Hence installing the tools on those is only available from 3rd parties. As of time of writting an rpm builds for RHEL and CentOS, as well as (universal rpm distros) src.rpm package is available on http://pkgs.repoforge.org/mytop/. For the sake of preservation – if in future those RPMs disappear, I made a mirror of mytop rpm's here

Mytop rpm builds depend on a package perl(Term::ReadKey), my attempt to install it on CentOS 5.6, returned following err:

[root@cenots ~]# rpm -ivh mytop-1.4-2.el5.rf.noarch.rpm
warning: mytop-1.4-2.el5.rf.noarch.rpm: Header V3 DSA signature: NOKEY, key ID 6b8d79e6
error: Failed dependencies:
perl(Term::ReadKey) is needed by mytop-1.4-2.el5.rf.noarch

The perl(Term::ReadKey package is not available in CentOS 5.6 and (probably other centos releases default repositories so I had to google perl(Term::ReadKey) I found it on http://rpm.pbone.net/ package repository, the exact url to the rpm dependency as of time of writting this post is:

ftp://ftp.pbone.net/mirror/yum.trixbox.org/centos/5/old/perl-Term-ReadKey-2.30-2.rf.i386.rpm

Quickest, way to install it is:

[root@centos ~]# rpm -ivh ftp://ftp.pbone.net/mirror/yum.trixbox.org/centos/5/old/perl-Term-ReadKey-2.30-2.rf.i386.rpmRetrieving ftp://ftp.pbone.net/mirror/yum.trixbox.org/centos/5/old/perl-Term-ReadKey-2.30-2.rf.i386.rpmPreparing... ########################################### [100%]
1:perl-Term-ReadKey ########################################### [100%]

This time mytop, install went fine:

[root@centos ~]# rpm -ivh mytop-1.4-2.el5.rf.noarch.rpm
warning: mytop-1.4-2.el5.rf.noarch.rpm: Header V3 DSA signature: NOKEY, key ID 6b8d79e6
Preparing... ########################################### [100%]
1:mytop ########################################### [100%]

To use it further, it is the usual syntax:

mytop -u username -p 'secret_password' -d database

CentOS Linux MyTOP MySQL query benchmark screenshot - vpopmail query

3. Installing mytop and mtop on FreeBSD and other BSDs

To debug the running SQL queries in a MySQL server running on FreeBSD, one could use both mytop and mtop – both are installable via ports:

a) To install mtop exec:

freebsd# cd /usr/ports/sysutils/mtop
freebsd# make install clean
....

b) To install mytop exec:

freebsd# cd /usr/ports/databases/mytop
freebsd# make install clean
....

I personally prefer to use mtop on FreeBSD, because once run it runs prompts the user to interactively type in the user/pass

freebsd# mtop

Then mtop prompts the user with "interactive" dialog screen to type in user and pass:

Mtop interactive type in username and password screenshot on FreeBSD 7.2

It is pretty annoying, same mtop like syntax don't show user/pass prompt:

freebsd# mytop
Cannot connect to MySQL server. Please check the:

* database you specified "test" (default is "test")
* username you specified "root" (default is "root")
* password you specified "" (default is "")
* hostname you specified "localhost" (default is "localhost")
* port you specified "3306" (default is 3306)
* socket you specified "" (default is "")
The options my be specified on the command-line or in a ~/.mytop
config file. See the manual (perldoc mytop) for details.
Here's the exact error from DBI. It might help you debug:
Unknown database 'test'

The correct syntax to run mytop instead is:

freebsd# mytop -u root -p 'secret_password' -d 'blog'

Or the longer more descriptive:

freebsd# mytop --user root --pass 'secret_password' --database 'blog'

By the way if you take a look at mytop's manual you will notice a tiny error in documentation, where the three options –user, –pass and –database are wrongly said to be used as -user, -pass, -database:

freebsd# mytop -user root -pass 'secret_password' -database 'blog'
Cannot connect to MySQL server. Please check the:

* database you specified "atabase" (default is "test")
* username you specified "ser" (default is "root")
* password you specified "ass" (default is "")
* hostname you specified "localhost" (default is "localhost")
* port you specified "3306" (default is 3306)
* socket you specified "" (default is "")a
...
Access denied for user 'ser'@'localhost' (using password: YES)

Actually it is interesting mytop, precededed historically mtop.
mtop was later written (probably based on mytop), to run on FreeBSD OS by a famous MySQL (IT) spec — Jeremy Zawodny .
Anyone who has to do frequent MySQL administration tasks, should already heard Zawodny's name.
For those who haven't, Jeremy used to be a head database administrators and developer in Yahoo! Inc. some few years ago.
His website contains plenty of interesting thoughts and writtings on MySQL server and database management
 

BSD (Berkley Software Distribituion) use by distribution type (FreeBSD, NetBSD, OpenBSD, DrangflyBSD) use percantage charts

Saturday, February 18th, 2012

I've hit an interesting article in Wikipedia called Comparison of BSD operating systems
The article explains basic difference between different BSD (Berkley Software Distributions) and what is the primary accent of each of the BSD (free software OS) distributions. It also reveals basic details about the history and how each of the BSD's came to existence. I recommend to anyone interested in free software as it is just a great reading for everybody interested in FOSS.

The most interesting part of the wiki thread is a bar chart, provided by BSD Certification Group research conducted in September 2005.

FreeBSD, OpenBSD, NetBSD, Dragonflybsd usage statistics

The above diagram is showing the proportion of users of each BSD variant from the BSD usage survey prior conducted

The research is already 6 years old, and unfortunately as of time of writting seems to be the only publicly available. Though being outdated, I believe generally the bar charts distributions along different BSD variants would be mostly true. The only big difference will be probably in PC-BSD which is not even on the diagram should have outbeaten DragonflyBSD's use. Since there is no public data available for 2012 and the years 2005 – 2012 for the use percantage of each of the BSD distributions, I've thought about a pseudo way to get some general statistics on each of the BSD distributions popularity. The methodology to gather the required statistics comes to simply, type in Google each of the BSD variant "code names" (e.g. freebsd, netbsd, openbsd etc.) and look at the number of results returned. It seems logical the more results distribution keyword searched returns, the bigger the probability of more users to be involved in developing or using the respective BSD variant.

Below you see the results, I've gathered in my quick "google research":

FreeBSD NetBSD OpenBSD BSD variant (users) use diagram based on Google searches of keywords 2012

As you can see in the above data FreeBSD is still probably leading the BSD use, the public interest to OpenBSD – BSD focused on security has significantly grow since the last 6 years. Next it is seen the PC-BSD users base has probably tremendously increased and according to the Google results returned it is probably on a 3rd place by users interest (use?) followed by NetBSD with only 1.47% of all the BSD users. Lastly with only 0.99%, orders Dragonfly BSD which no longer is so popular as a Desktop BSD based OS as it used to be back in 2005.
Again the presented diagram results are based on only on the factor of Google BSD variant popularity and hence shouldn't be consired too trustworthy, still I'm sure it gives a general idea on how used is each of the BSD variants as of Jan 2012.