Posts Tagged ‘shell scripting’

Linux: Generating Web statistics from Old Apache logs with Webalizer

Thursday, July 25th, 2013

[Image: Webalizer logo - generating static, Google Analytics-like HTML statistics for old websites on Linux]

Often it happens that some old hosted websites were created in a way that no web statistics are available for them. Almost all websites created nowadays are already set up to use Google Analytics, but every now and then I stumble upon hosting clients whose website creator didn't think about how to track how many hits or unique visitors the site gets in a month / year etc.
Thankfully this is solvable by the good "uncle" admin with the help of Webalizer (with a custom configuration) and a little bit of shell scripting.

The idea is simple: we take the old website logs located in, let's say,
/var/log/apache2/www.website-access.log*,
move the files to some newly created directory, let's say /root/www.website/, and then configure Webalizer to read them and generate statistics based on the logs in there.

For the purpose, Webalizer has to be installed on the Linux system. In my case this is Debian GNU / Linux.

For those who hear of Webalizer for the first time, here is the short package description:

debian:~# apt-cache show webalizer|grep -i description -A 2

Description-en: web server log analysis program
The Webalizer was designed to scan web server log files in various formats
and produce usage statistics in HTML format for viewing through a browser.

If webalizer is not installed yet, install it with:

debian:~# apt-get install --yes webalizer
...
.....

Then make a backup copy of the original / default webalizer.conf (a very important step, especially if the server is already processing Apache log files with some custom webalizer configuration):

debian:~# cp -rpf /etc/webalizer/webalizer.conf /etc/webalizer/webalizer.conf.orig

The next step is to copy webalizer.conf under a name reminding of the website whose logs will be processed, e.g.:

debian:~# cp -rpf /etc/webalizer/webalizer.conf /etc/webalizer/www.website-webalizer.conf

In the www.website-webalizer.conf config file it is necessary to edit at least 4 variables:

LogFile /var/log/apache2/access.log
OutputDir /var/www
#Incremental no
ReportTitle Usage statistics for

Make sure that after modifying them the 4 variables read something like:
LogFile /root/www.website/access_log_merged_1.log
OutputDir /var/www/webalizer-www.website
Incremental yes
ReportTitle Usage statistics for Your-Website-Host-Name.com
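
If you prefer doing the edit non-interactively, a quick sed sketch along these lines should set all four variables at once (the paths and hostname are the placeholders from this example, adjust to your site):

debian:~# sed -i -e 's|^LogFile .*|LogFile /root/www.website/access_log_merged_1.log|' \
 -e 's|^OutputDir .*|OutputDir /var/www/webalizer-www.website|' \
 -e 's|^#*Incremental .*|Incremental yes|' \
 -e 's|^ReportTitle .*|ReportTitle Usage statistics for Your-Website-Host-Name.com|' \
 /etc/webalizer/www.website-webalizer.conf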

Next create /root/www.website and /var/www/webalizer-www.website, then copy all the files you need to process from /var/log/apache2/www.website* to /root/www.website:

debian:~# mkdir -p /root/www.website /var/www/webalizer-www.website
debian:~# cp -rpf /var/log/apache2/www.website* /root/www.website

On Debian, Apache uses logrotate to rotate and compress old log files, so all logs except www.website-access.log and www.website-access.log.1 are gzipped:

debian:~#  cd /root/www.website
debian:~# ls 
www.website-access.log.10.gz
www.website-access.log.11.gz
www.website-access.log.12.gz
www.website-access.log.13.gz
www.website-access.log.14.gz
www.website-access.log.15.gz
www.website-access.log.16.gz
www.website-access.log.17.gz
www.website-access.log.18.gz
www.website-access.log.19.gz
www.website-access.log.20.gz
...
 

Then we have to un-gzip the compressed logs and create one merged file from all of them, ready to be read later by Webalizer. To do so I use a tiny shell script like so:

for n in {52..2}; do gzip -d www.website-access.log.$n.gz; done
for n in {52..1}; do cat www.website-access.log.$n >> access_log_merged_1.log; done

The first loop de-gzips the archives (starting from .2, since logrotate leaves .log.1 uncompressed) and the second one concatenates everything into a single merged file named access_log_merged_1.log. The range of log files in my case is from www.website-access.log.1 to www.website-access.log.52, thus the loops count backwards from 52 down to 1 so the oldest entries end up first.
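
If you prefer not to hardcode which rotations are gzipped, a slightly more defensive variant of the same merge (just a sketch, assuming the file layout shown above) could be:

#!/bin/bash
# merge rotated Apache logs for the site, oldest rotation first
cd /root/www.website || exit 1
> access_log_merged_1.log
for n in {52..1}; do
    if [ -f www.website-access.log.$n.gz ]; then
        # decompress to stdout so the original archive stays untouched
        zcat www.website-access.log.$n.gz >> access_log_merged_1.log
    elif [ -f www.website-access.log.$n ]; then
        cat www.website-access.log.$n >> access_log_merged_1.log
    fi
done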

Once access_log_merged_1.log is ready, we can run webalizer to process the file (incrementally) and generate all-time statistics for www.website:

debian:~# webalizer -c /etc/webalizer/www.website-webalizer.conf

Webalizer V2.01-10 (Linux 2.6.32-27-server) locale: en_US.UTF-8
Using logfile /root/www.website/access_log_merged_1.log (clf)
Using default GeoIP database
Creating output in /var/www/webalizer-www.website
Hostname for reports is 'debian'
Reading history file… webalizer.hist
Reading previous run data.. webalizer.current
333474 records (333474 ignored) in 37.50 seconds, 8892/sec

To check out the just-generated statistics, open in a browser:

http://yourserverhost/webalizer-www.website/

or

http://IP_Address/webalizer-www.website

You should see the statistics show up; below is a screenshot with my currently generated stats:

[Screenshot: Webalizer website access statistics on Debian GNU / Linux]
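
Since Incremental is set to yes, the same configuration can be re-used later to keep the statistics growing: once the old logs are processed, you could point LogFile back to the live /var/log/apache2/www.website-access.log and let cron re-run webalizer periodically. A possible crontab entry (only a suggestion, times and paths are arbitrary):

debian:~# crontab -e
# re-run webalizer once a day at 03:05 against the configured LogFile
5 3 * * * /usr/bin/webalizer -c /etc/webalizer/www.website-webalizer.conf >/dev/null 2>&1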

Make daily Linux MySQL database backups with shell script

Thursday, May 23rd, 2013

[Image: creating MySQL database backups with the MySQLBackupper and mysqlbackup shell scripts]

Some time ago, I wrote a tiny shell script which does dumps of complete (SQL script) MySQL databases. There are plenty of ways to back up a MySQL database and plenty of scripts on the net, but I like doing it my own way. I have a few backup scripts. I prefer scripted SQL dumps over keeping binary logs or using some un-traditional backup methods like copying all the binary data in /var/lib/mysql.
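
As a rough illustration of the approach (not the exact MySQLBackupper script, which is linked below), a minimal dump-everything sketch could look like this; the credentials and backup directory are placeholders:

#!/bin/bash
# minimal sketch: dump all MySQL databases into a dated, gzipped SQL file
BACKUP_DIR=/backups/mysql
MYSQL_USER=root
MYSQL_PASS='secret'            # better kept in ~/.my.cnf than inside the script
DATE=$(date +%Y-%m-%d_%H-%M)

mkdir -p "$BACKUP_DIR"
mysqldump -u"$MYSQL_USER" -p"$MYSQL_PASS" --all-databases --single-transaction \
  | gzip > "$BACKUP_DIR/mysql-all-databases-$DATE.sql.gz"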

One of them was intended to dump the whole database with mysqldump and later upload it to a central server running tsh (a tiny shell). Using tsh is maybe not the best upload method, but the script can easily be modified to use passwordless SSH authentication for the upload instead.
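
For the passwordless SSH variant, the upload step boils down to something like the sketch below (the backup server hostname and directories are made-up examples, and it assumes an authorized SSH key is already in place):

#!/bin/bash
# push the newest local dump to the central backup server over SSH
BACKUP_SERVER=backup.example.com
REMOTE_DIR=/backups
LATEST=$(ls -t /backups/mysql/mysql-all-databases-*.sql.gz | head -n 1)

scp "$LATEST" root@"$BACKUP_SERVER":"$REMOTE_DIR"/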

I'm not a pro shell scripter, but the MySQLBackupper script can be useful for learning some simple bash shell scripting.

To use the script as intended you will have to build tsh from source. Tsh is in a very early development stage (ver 0.2), but as far as I tested it some years ago it does well what it is intended for. You can download the MySQLBackupper.sh script from here.
Earlier, I used MySQLBackupper.sh to upload all SQL dumps to the /backups directory on a central backup storage server, thus I had written a secondary script to classify the uploaded backups based on the backup archive name. The script used is mysqldumps-classify.sh and can be viewed here. Though this way of making backups needs a bit of custom work, for managing backups of up to 10 / 20 servers it worked well.
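
The classification logic itself is simple; a stripped-down sketch of the idea (the archive naming convention here is assumed for the example, not necessarily what mysqldumps-classify.sh expects) would be:

#!/bin/bash
# sort uploaded dumps into per-server directories based on the archive name
# assumed naming: <hostname>-mysql-<date>.sql.gz, uploaded into /backups
UPLOAD_DIR=/backups
for archive in "$UPLOAD_DIR"/*-mysql-*.sql.gz; do
    [ -e "$archive" ] || continue
    host=$(basename "$archive" | cut -d- -f1)   # hostname = first dash-separated field
    mkdir -p "$UPLOAD_DIR/$host"
    mv "$archive" "$UPLOAD_DIR/$host/"
done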

I have also written another MySQL backup script which is much more simplistic: it only dumps with mysqldump and stores the copies on the hard disk in a tar.gz archive. You can download my other simple mysqlbackup.sh here.

The only inconvenient thing about the above scripts is that they dump all SQL databases. Hence, whenever I need to get the content of a single database out of the (complete) all-databases SQL script backup, I use a SED (stream editor) one-liner.
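
The one-liner is along these lines: mysqldump --all-databases marks each database section with a "-- Current Database:" comment, so sed can print just the range between two such markers (replace db_name with the database you need; the dump file name is only an example):

sed -n '/^-- Current Database: `db_name`/,/^-- Current Database: `/p' mysql-all-databases-dump.sql > db_name.sql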

It would be interesting to hear how others prepare their MySQL DB backups.