Block Bad Crawler Bots and Search Engine Spiders that overload your Web server with .htaccess rules

Monday, September 18th, 2017


In my last post I talked about the problem of search index crawler robots aggressively crawling websites and how to stop them (the article is here), explaining how to raise the delay between a bot's URL requests to a website and how to completely prohibit some bots from crawling with robots.txt.

As explained in that article, the consequence of too many badly written or aggressively behaving spiders is "server stoning" and therefore degraded web server performance, or even a short-lived Denial of Service, depending on how well the initial server scaling was done.

The bots we want to filter are not to be confused with the legitimate bots that drive real traffic to your website.

Just for information, the 10 most popular web crawler bots as of the time of writing are:

1. GoogleBot (the Google crawler bots; funnily, the bots become less active on Saturdays and Sundays :))

2. BingBot (Bing crawler bots)

3. SlurpBot (also famous as Yahoo! Slurp)

4. DuckDuckBot (the DuckDuckGo search engine crawler bots)

5. Baiduspider (the most famous Chinese search engine, used as a substitute for Google in China)

6. YandexBot (Russian Yandex search engine crawler bots, used in Russia as a substitute for Google)

7. Sogou Spider (leading Chinese search engine launched in 2004)

8. Exabot (A French Search Engine, launched in 2000, crawler for ExaLead Search Engine)

9. FaceBot (Facebook external hit; this crawler fetches a given web page only once a user shares or pastes a link with video, music, a blog post or whatever in a chat to another user)

10. Alexa Crawler (ia_archiver is the web crawler for Amazon's Alexa Internet rankings. Alexa is a great site to evaluate the approximate popularity of a page on the internet; the Alexa SiteInfo page has historically been the Swiss Army knife for anyone wanting to quickly evaluate a web page's approximate ranking compared to other pages)

The legitimate bots above are known to follow most if not all of the W3C (World Wide Web Consortium, W3.org) standards and therefore respect the allow and disallow commands given for a site in robots.txt. Unfortunately, many of the so-called bad bots or mirroring scripts mentioned in the previous article, the ones burning your web server's CPU and memory, either ignore the /robots.txt prescriptions completely or follow them only partially.

Hence, for bots that do not respect robots.txt, the only way to get rid of most of the web spiders that are just loading your bandwidth and server hardware is to filter / block them on the Apache side, marking their User-Agent with SetEnvIfNoCase (mod_setenvif) and denying the marked requests, through .htaccess rules.




Create, if it does not already exist, an .htaccess file in the DocumentRoot of your website with whatever text editor you like, or create it on your Windows / Mac OS desktop and transfer it to the server via FTP / Secure FTP.

I prefer to do it directly on the server with vim (text editor):



vim /var/www/sites/


RewriteEngine On

IndexIgnore .htaccess */.??* *~ *# */HEADER* */README* */_vti*

SetEnvIfNoCase User-Agent "^Black Hole" bad_bot
SetEnvIfNoCase User-Agent "^Titan" bad_bot
SetEnvIfNoCase User-Agent "^WebStripper" bad_bot
SetEnvIfNoCase User-Agent "^NetMechanic" bad_bot
SetEnvIfNoCase User-Agent "^CherryPicker" bad_bot
SetEnvIfNoCase User-Agent "^EmailCollector" bad_bot
SetEnvIfNoCase User-Agent "^EmailSiphon" bad_bot
SetEnvIfNoCase User-Agent "^WebBandit" bad_bot
SetEnvIfNoCase User-Agent "^EmailWolf" bad_bot
SetEnvIfNoCase User-Agent "^ExtractorPro" bad_bot
SetEnvIfNoCase User-Agent "^CopyRightCheck" bad_bot
SetEnvIfNoCase User-Agent "^Crescent" bad_bot
SetEnvIfNoCase User-Agent "^Wget" bad_bot
SetEnvIfNoCase User-Agent "^SiteSnagger" bad_bot
SetEnvIfNoCase User-Agent "^ProWebWalker" bad_bot
SetEnvIfNoCase User-Agent "^CheeseBot" bad_bot
SetEnvIfNoCase User-Agent "^Teleport" bad_bot
SetEnvIfNoCase User-Agent "^TeleportPro" bad_bot
SetEnvIfNoCase User-Agent "^MIIxpc" bad_bot
SetEnvIfNoCase User-Agent "^Telesoft" bad_bot
SetEnvIfNoCase User-Agent "^Website Quester" bad_bot
SetEnvIfNoCase User-Agent "^WebZip" bad_bot
SetEnvIfNoCase User-Agent "^moget/2.1" bad_bot
SetEnvIfNoCase User-Agent "^WebZip/4.0" bad_bot
SetEnvIfNoCase User-Agent "^WebSauger" bad_bot
SetEnvIfNoCase User-Agent "^WebCopier" bad_bot
SetEnvIfNoCase User-Agent "^NetAnts" bad_bot
SetEnvIfNoCase User-Agent "^Mister PiX" bad_bot
SetEnvIfNoCase User-Agent "^WebAuto" bad_bot
SetEnvIfNoCase User-Agent "^TheNomad" bad_bot
SetEnvIfNoCase User-Agent "^WWW-Collector-E" bad_bot
SetEnvIfNoCase User-Agent "^RMA" bad_bot
SetEnvIfNoCase User-Agent "^libWeb/clsHTTP" bad_bot
SetEnvIfNoCase User-Agent "^asterias" bad_bot
SetEnvIfNoCase User-Agent "^httplib" bad_bot
SetEnvIfNoCase User-Agent "^turingos" bad_bot
SetEnvIfNoCase User-Agent "^spanner" bad_bot
SetEnvIfNoCase User-Agent "^InfoNaviRobot" bad_bot
SetEnvIfNoCase User-Agent "^Harvest/1.5" bad_bot
SetEnvIfNoCase User-Agent "Bullseye/1.0" bad_bot
SetEnvIfNoCase User-Agent "^Mozilla/4\.0 \(compatible; BullsEye; Windows 95\)" bad_bot
SetEnvIfNoCase User-Agent "^Crescent Internet ToolPak HTTP OLE Control v.1.0" bad_bot
SetEnvIfNoCase User-Agent "^CherryPickerSE/1.0" bad_bot
SetEnvIfNoCase User-Agent "^CherryPicker /1.0" bad_bot
SetEnvIfNoCase User-Agent "^WebBandit/3.50" bad_bot
SetEnvIfNoCase User-Agent "^NICErsPRO" bad_bot
SetEnvIfNoCase User-Agent "^Microsoft URL Control - 5.01.4511" bad_bot
SetEnvIfNoCase User-Agent "^DittoSpyder" bad_bot
SetEnvIfNoCase User-Agent "^Foobot" bad_bot
SetEnvIfNoCase User-Agent "^WebmasterWorldForumBot" bad_bot
SetEnvIfNoCase User-Agent "^SpankBot" bad_bot
SetEnvIfNoCase User-Agent "^BotALot" bad_bot
SetEnvIfNoCase User-Agent "^lwp-trivial/1.34" bad_bot
SetEnvIfNoCase User-Agent "^lwp-trivial" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.6" bad_bot
SetEnvIfNoCase User-Agent "^BunnySlippers" bad_bot
SetEnvIfNoCase User-Agent "^Microsoft URL Control - 6.00.8169" bad_bot
SetEnvIfNoCase User-Agent "^URLy Warning" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.5.3" bad_bot
SetEnvIfNoCase User-Agent "^LinkWalker" bad_bot
SetEnvIfNoCase User-Agent "^cosmos" bad_bot
SetEnvIfNoCase User-Agent "^moget" bad_bot
SetEnvIfNoCase User-Agent "^hloader" bad_bot
SetEnvIfNoCase User-Agent "^humanlinks" bad_bot
SetEnvIfNoCase User-Agent "^LinkextractorPro" bad_bot
SetEnvIfNoCase User-Agent "^Offline Explorer" bad_bot
SetEnvIfNoCase User-Agent "^Mata Hari" bad_bot
SetEnvIfNoCase User-Agent "^LexiBot" bad_bot
SetEnvIfNoCase User-Agent "^Web Image Collector" bad_bot
SetEnvIfNoCase User-Agent "^The Intraformant" bad_bot
SetEnvIfNoCase User-Agent "^True_Robot/1.0" bad_bot
SetEnvIfNoCase User-Agent "^True_Robot" bad_bot
SetEnvIfNoCase User-Agent "^BlowFish/1.0" bad_bot
SetEnvIfNoCase User-Agent "^JennyBot" bad_bot
SetEnvIfNoCase User-Agent "^MIIxpc/4.2" bad_bot
SetEnvIfNoCase User-Agent "^BuiltBotTough" bad_bot
SetEnvIfNoCase User-Agent "^ProPowerBot/2.14" bad_bot
SetEnvIfNoCase User-Agent "^BackDoorBot/1.0" bad_bot
SetEnvIfNoCase User-Agent "^toCrawl/UrlDispatcher" bad_bot
SetEnvIfNoCase User-Agent "^WebEnhancer" bad_bot
SetEnvIfNoCase User-Agent "^TightTwatBot" bad_bot
SetEnvIfNoCase User-Agent "^suzuran" bad_bot
SetEnvIfNoCase User-Agent "^VCI WebViewer VCI WebViewer Win32" bad_bot
SetEnvIfNoCase User-Agent "^VCI" bad_bot
SetEnvIfNoCase User-Agent "^Szukacz/1.4" bad_bot
SetEnvIfNoCase User-Agent "^QueryN Metasearch" bad_bot
SetEnvIfNoCase User-Agent "^Openfind data gathere" bad_bot
SetEnvIfNoCase User-Agent "^Openfind" bad_bot
SetEnvIfNoCase User-Agent "^Xenu's Link Sleuth 1.1c" bad_bot
SetEnvIfNoCase User-Agent "^Xenu's" bad_bot
SetEnvIfNoCase User-Agent "^Zeus" bad_bot
SetEnvIfNoCase User-Agent "^RepoMonkey Bait & Tackle/v1.01" bad_bot
SetEnvIfNoCase User-Agent "^RepoMonkey" bad_bot
SetEnvIfNoCase User-Agent "^Zeus 32297 Webster Pro V2.9 Win32" bad_bot
SetEnvIfNoCase User-Agent "^Webster Pro" bad_bot
SetEnvIfNoCase User-Agent "^EroCrawler" bad_bot
SetEnvIfNoCase User-Agent "^LinkScan/8.1a Unix" bad_bot
SetEnvIfNoCase User-Agent "^Keyword Density/0.9" bad_bot
SetEnvIfNoCase User-Agent "^Kenjin Spider" bad_bot
SetEnvIfNoCase User-Agent "^Cegbfeieh" bad_bot


<Limit GET POST>
Order Allow,Deny
Allow from all
Deny from env=bad_bot
</Limit>


The bad bot prohibition rules above include the RewriteEngine On directive; however, on many websites this directive is already enabled directly in the VirtualHost section for the domain/s. If that is your case you might also remove RewriteEngine On from .htaccess, and the bad bot prohibition rules should continue to work, as the SetEnvIfNoCase and Deny lines themselves rely on mod_setenvif and Apache's access control rather than on mod_rewrite.
The rules above are also perfectly suitable for WordPress based websites / blogs in case you need to filter out obstructive spiders, and they would work on any Apache-served website domain where .htaccess files are honoured.

Once you have implemented the above rules, you will not need to restart Apache, as .htaccess is read dynamically on each client request to the web server.
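To get a feeling for how the "^Name" patterns behave, here is a minimal shell sketch that mimics the matching outside of Apache. It assumes only a handful of patterns picked from the list, and the is_bad_bot function name is a hypothetical helper made up for illustration; Apache does the real matching, this is just something to play with on the command line.

```shell
#!/bin/sh
# Hypothetical helper (not part of Apache): imitates how SetEnvIfNoCase
# treats "^Name" patterns. Only a few sample patterns are listed here.
is_bad_bot() {
    ua=$1
    for pattern in '^WebStripper' '^CheeseBot' '^Wget' '^EmailSiphon'; do
        # -i: ignore case (like SetEnvIfNoCase), -E: extended regex, -q: no output
        if printf '%s\n' "$ua" | grep -qiE "$pattern"; then
            echo "blocked"
            return 0
        fi
    done
    echo "allowed"
}

is_bad_bot "cheesebot/1.0"                          # prints: blocked
is_bad_bot "Mozilla/5.0 (compatible; CheeseBot)"    # prints: allowed
```

Notice that the second call is allowed: the ^ anchors the pattern to the start of the User-Agent header, so a bot that hides its name in the middle of a Mozilla-style string slips through unless you drop the ^ from that rule.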

2. Testing .htaccess Bad Bots Filtering Works as Expected

To test that the new bad bot filtering configuration works properly, there is a manual and somewhat more complicated way with lynx (text browser), assuming you have shell access to a Linux / BSD / *Nix computer or run your own *NIX server / desktop computer.

Here is how:


lynx -useragent="Mozilla/5.0 (compatible;; +" -head -dump



Note that lynx will provide a warning such as:

Warning: User-Agent string does not contain "Lynx" or "L_y_n_x"!

Just ignore it and press enter to continue.

Two other use cases for lynx that I have historically used heavily are to pretend you're GoogleBot, in order to see how Google actually sees your website:

  • Pretend with Lynx You're GoogleBot


lynx -useragent="Mozilla/5.0 (compatible; Googlebot/2.1; +" -head -dump



  • How to Pretend with Lynx Browser You are GoogleBot-Mobile


lynx -useragent="Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; Googlebot-Mobile/2.1; +" -head -dump


Or, for the lazy ones who don't have Linux / *Nix at their disposal, you can use the WannaBrowser website.

Wannabrowser is a web based browser emulator which gives you the ability to change the User-Agent on each website request, so just set your User-Agent to any bot that we just filtered, for example set User-Agent to CheeseBot.

Once the .htaccess rules added earlier detect that your browser client is coming in with a prohibited user agent, they will immediately filter it out and you'll be unable to access the website, getting a message like:

HTTP/1.1 403 Forbidden


Since I've talked a lot about index bots, I think it is worth also mentioning three great websites that can give you a lot of up-to-date information on the exact user-agent strings spiders return, commonly known bot traits, as well as a currently updated list of bad bots etc.

Bot and Browser Resources information user-agents, bad-bots and odd Crawlers and Bots specifics



An updated list of robot user-agents (crawler-user-agents) is also available on GitHub here, regularly updated by Caia Almeido.

There are also third party plugins (modules) available for website platforms like WordPress / Joomla / Typo3 etc.

Besides the ones listed on these websites and the known bad and good bots, there are perhaps a hundred others that might end up crawling your website and that might or might not need to be filtered. Therefore, before proceeding with any filtering steps, it is generally a good idea to monitor your HTTPD access.log / error.log, because if you happen to mistakenly filter the wrong bot this might become a reason for website indexing problems.

Hope this article gave you some valuable information. Enjoy! 🙂


Finding top access IPs in Webserver or how to delay connects from Bots (Web Spiders) to your site to prevent connect Denial of Service

Friday, September 15th, 2017


If you're a sysadmin who has to deal with cracker attempts at DoS (Denial of Service) against single or multiple Apache web servers (clustered, behind a CDN, or standalone), no matter whether you work for some web hosting company or just run your private home-brew web server, it is very useful to inspect the web server log file (in Apache HTTPD's case that's access.log).

Sometimes web server overloads and the follow-up Denial of Service (DoS) effect are not caused by evil crackers (mistakenly often called hackers) but by some data indexing search engine crawler bots that are badly configured to crawl websites aggressively, causing high web server load by flooding your servers with requests that end in 404, 400, 500 or other error responses.

1. Dealing with bad Search Indexer Bots (Spiders) with robots.txt

Since I mentioned the word hackers above, I feel obliged to expose the harmful lie the press and media have been spreading for years, confusing in people's minds the word cracker (computer intruder) with hacker. If you're one of those who mistakenly call security intruders hackers, I recommend you read Dr. Richard Stallman's article On Hacking to get the proper understanding that hacking is a cheerful attitude of mind and spirit, and that a hacker could be anyone out there who has this kind of curious and playful mind. Hackers are very often computer professionals and many times skillful programmers; a hacker tends to do things in very non-standard and weird ways to make fun out of life, but definitely follows the rule of doing no harm to the neighbor.

Well, after the short lyrical digression above, let me continue;

Here is a short list of Search Index Crawler bots with very aggressive behaviour towards websites:


# mass download bots / mirroring utilities
1. webzip
2. webmirror
3. webcopy
4. netants
5. getright
6. wget
7. webcapture
8. libwww-perl
11. Teleport / TeleportPro
12. Zeus

Note that some of the listed crawler bots are actually mirroring client tools (wget etc.). They're also included in the list of server hammering bots because websites are often mirrored by people who want to copy the content for the sake of good but, perhaps more often these days, who mirror (duplicate) your content for the sake of stealing it; in SEO language this is called content stealing.

I've found a very comprehensive list of bad bots to block on Mike's tech blog; the example bad robots.txt file provided on his website is mirrored as a plain text file here.

Below is the list of Bad Crawler Spiders taken from his site:


# robots.txt to prohibit bad internet search engine spiders to crawl your website
# Begin block Bad-Robots from robots.txt
User-agent: asterias
User-agent: BackDoorBot/1.0
User-agent: Black Hole
User-agent: BlowFish/1.0
User-agent: BotALot
User-agent: BuiltBotTough
User-agent: Bullseye/1.0
User-agent: BunnySlippers
User-agent: Cegbfeieh
User-agent: CheeseBot
User-agent: CherryPicker
User-agent: CherryPickerElite/1.0
User-agent: CherryPickerSE/1.0
User-agent: CopyRightCheck
User-agent: cosmos
User-agent: Crescent
User-agent: Crescent Internet ToolPak HTTP OLE Control v.1.0
User-agent: DittoSpyder
User-agent: EmailCollector
User-agent: EmailSiphon
User-agent: EmailWolf
User-agent: EroCrawler
User-agent: ExtractorPro
User-agent: Foobot
User-agent: Harvest/1.5
User-agent: hloader
User-agent: httplib
User-agent: humanlinks
User-agent: InfoNaviRobot
User-agent: JennyBot
User-agent: Kenjin Spider
User-agent: Keyword Density/0.9
User-agent: LexiBot
User-agent: libWeb/clsHTTP
User-agent: LinkextractorPro
User-agent: LinkScan/8.1a Unix
User-agent: LinkWalker
User-agent: LNSpiderguy
User-agent: lwp-trivial
User-agent: lwp-trivial/1.34
User-agent: Mata Hari
User-agent: Microsoft URL Control - 5.01.4511
User-agent: Microsoft URL Control - 6.00.8169
User-agent: MIIxpc
User-agent: MIIxpc/4.2
User-agent: Mister PiX
User-agent: moget
User-agent: moget/2.1
User-agent: mozilla/4
User-agent: Mozilla/4.0 (compatible; BullsEye; Windows 95)
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 95)
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 98)
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows NT)
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows XP)
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows 2000)
User-agent: Mozilla/4.0 (compatible; MSIE 4.0; Windows ME)
User-agent: mozilla/5
User-agent: NetAnts
User-agent: NICErsPRO
User-agent: Offline Explorer
User-agent: Openfind
User-agent: Openfind data gathere
User-agent: ProPowerBot/2.14
User-agent: ProWebWalker
User-agent: QueryN Metasearch
User-agent: RepoMonkey
User-agent: RepoMonkey Bait & Tackle/v1.01
User-agent: RMA
User-agent: SiteSnagger
User-agent: SpankBot
User-agent: spanner
User-agent: suzuran
User-agent: Szukacz/1.4
User-agent: Teleport
User-agent: TeleportPro
User-agent: Telesoft
User-agent: The Intraformant
User-agent: TheNomad
User-agent: TightTwatBot
User-agent: Titan
User-agent: toCrawl/UrlDispatcher
User-agent: True_Robot
User-agent: True_Robot/1.0
User-agent: turingos
User-agent: URLy Warning
User-agent: VCI
User-agent: VCI WebViewer VCI WebViewer Win32
User-agent: Web Image Collector
User-agent: WebAuto
User-agent: WebBandit
User-agent: WebBandit/3.50
User-agent: WebCopier
User-agent: WebEnhancer
User-agent: WebmasterWorldForumBot
User-agent: WebSauger
User-agent: Website Quester
User-agent: Webster Pro
User-agent: WebStripper
User-agent: WebZip
User-agent: WebZip/4.0
User-agent: Wget
User-agent: Wget/1.5.3
User-agent: Wget/1.6
User-agent: WWW-Collector-E
User-agent: Xenu's
User-agent: Xenu's Link Sleuth 1.1c
User-agent: Zeus
User-agent: Zeus 32297 Webster Pro V2.9 Win32
Crawl-delay: 20
# Begin Exclusion From Directories from robots.txt
Disallow: /cgi-bin/

A very important variable among the ones passed by the above robots.txt is

Crawl-Delay: 20


You might want to tune that variable. A Crawl-Delay of 20 instructs any web spider that respects robots.txt variables to wait 20 seconds between each and every request it makes. That is really useful for the web server, as fewer connects mean less CPU and memory usage and less of the performance degradation caused by aggressive bots crawling your site like crazy, requesting resources 10 times per second or so …
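To put the delay into numbers, here is a small back-of-the-envelope sketch. The max_requests_per_day function name is hypothetical, and the calculation assumes a single well-behaved bot that honours the delay exactly; keep in mind that Crawl-delay is a non-standard robots.txt extension and some crawlers (Googlebot among them, as far as I know) simply ignore it.

```shell
#!/bin/sh
# Hypothetical helper: upper bound of requests per day for ONE bot
# that honours a given Crawl-delay (in seconds). A day has 86400 seconds.
max_requests_per_day() {
    echo $(( 86400 / $1 ))
}

max_requests_per_day 20    # prints 4320
```

Compare that with a bot hammering the server 10 times per second, which adds up to 864000 requests a day.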

As you can conclude from the names of some of the bots, having them disabled would protect your domain/s and clients from email harvesting spiders and other undesired activities.


2. Listing IP address hits / how many connects per IP, used to find the IPs that overload the server with a huge number of connects

After saying a few words about SE bots, I think it is fair to also mention here a number of commands that help the sysadmin inspect Apache's access.log files.
Inspecting the log files regularly is really useful, as the number of malicious spider bots and cracker users tends to rise with time. It is therefore good to have a way to track the IPs that are stoning your web server, so you can later prohibit them from crawling, softly via robots.txt (not all of the bots would respect that) or via the .htaccess file, or as a last resort directly from the firewall.

The command below generates a list of IPs showing how many times each of the IPs connected to the web server (bear in mind that the commands assume the log field order given by most GNU / Linux distributions' default Apache logging configuration):


webhosting-server:~# cd /var/log/apache2
webhosting-server:/var/log/apache2# cat access.log| awk '{print $1}' | sort | uniq -c |sort -n

The above command provides statistics based on all the records in the access.log file. Sometimes you will need to analyze just a chunk of the web server log, let's say the last 12000 connects; here is how:

webhosting-server:~# cd /var/log/apache2
webhosting-server:/var/log/apache2# tail -n 12000 access.log| awk '{print $1}' | sort | uniq -c |sort -n
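If you run this kind of pipeline often, it might be handy to wrap it in a small shell function. The sketch below is one possible way to do it; the top_ips name and its arguments are my own assumptions, and it expects the client IP to be the first field of each log line, as in the default Apache log format.

```shell
#!/bin/sh
# Hypothetical helper: print the N most active client IPs of a log file,
# busiest first. Assumes the IP is the first whitespace-separated field.
top_ips() {
    logfile=$1
    count=${2:-10}     # default to the top 10 when no count is given
    awk '{print $1}' "$logfile" | sort | uniq -c | sort -rn | head -n "$count"
}

# Example call (path is an assumption, adjust to your server):
# top_ips /var/log/apache2/access.log 5
```

Sorting with -rn puts the heaviest hitters on top, so head can cut the list instead of you scrolling to the bottom.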

You can combine the basic bash shell parser commands above with the watch command to get top-like, periodically refreshed IP statistics of the most active visitors to your websites.

Here is an example:


webhosting-server:/var/log/apache2# watch "cat access.log| awk '{print \$1}' | sort | uniq -c |sort -n"


Once you have the top connecting IPs, look for any IP that connects, let's say, 8000-10000 times in a really short interval of time, 20-30 minutes or so. It is a good idea to investigate further where such an IP originates from and, if it turns out to be some malicious Denial of Service, to filter it out either in the firewall (with iptables rules) or by asking your ISP or web hosting provider to do you a favour and drop all the incoming traffic from that IP.
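Spotting such IPs by eye gets tiresome on a big log, so here is a hedged sketch that prints only the IPs above a chosen number of hits. The ips_over_threshold name and the threshold value are assumptions of mine; the right limit depends entirely on your normal traffic.

```shell
#!/bin/sh
# Hypothetical helper: list "hits ip" for every client IP that appears
# more than THRESHOLD times in the log, busiest first.
ips_over_threshold() {
    logfile=$1
    threshold=$2
    awk -v limit="$threshold" '
        { hits[$1]++ }                      # count requests per client IP
        END {
            for (ip in hits)
                if (hits[ip] > limit)
                    print hits[ip], ip
        }' "$logfile" | sort -rn
}

# Example call (path and limit are assumptions):
# ips_over_threshold /var/log/apache2/access.log 8000
```

Feed it a chunk produced with tail -n if you only care about a recent time window rather than the whole file.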

Here is how to investigate a bit more about a server-stoning IP.
Let's assume that you found IP: to be having too many connects to your web server:

webhosting-server:~# grep -i /var/log/apache2/access.log|tail -n 1
- - [12/Sep/2017:07:42:13 +0300] "GET / HTTP/1.1" 403 371 "-" "Mozilla/5.0 (compatible;; +"


webhosting-server:~# host domain name pointer


webhosting-server:~# whois|less


The output you will get would be something like:

% This is the RIPE Database query service.
% The objects are in RPSL format.
% The RIPE Database is subject to Terms and Conditions.
% See

% Note: this output has been filtered.
%       To receive output for a database update, use the "-B" flag.

% Information related to ' –'

% Abuse contact for ' –' is ''

inetnum: –
netname:        HETZNER-RZ15
descr:          Hetzner Online GmbH
descr:          Datacenter 15
country:        DE
admin-c:        HOAC1-RIPE
tech-c:         HOAC1-RIPE
status:         ASSIGNED PA
mnt-by:         HOS-GUN
mnt-lower:      HOS-GUN
mnt-routes:     HOS-GUN
created:        2012-03-12T09:45:54Z
last-modified:  2015-08-10T09:29:53Z
source:         RIPE

role:           Hetzner Online GmbH – Contact Role
address:        Hetzner Online GmbH
address:        Industriestrasse 25
address:        D-91710 Gunzenhausen
address:        Germany
phone:          +49 9831 505-0
fax-no:         +49 9831 505-3
remarks:        *************************************************
remarks:        * For spam/abuse/security issues please contact *
remarks:        *, not this address. *
remarks:        * The contents of your abuse email will be *
remarks:        * forwarded directly on to our client for *
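Besides host and whois, it often helps to see which User-Agent strings the suspicious IP has been sending. The sketch below is one way to pull them out; the agents_for_ip name is hypothetical and it assumes Apache's combined log format, where the User-Agent is the last double-quoted field. Unlike a plain grep, it matches the IP literally at the start of the line, so a search for 192.0.2.10 does not also catch 192.0.2.100.

```shell
#!/bin/sh
# Hypothetical helper: count the User-Agent strings sent by one client IP.
# Assumes combined log format: ip - - [date] "request" status size "referer" "agent"
agents_for_ip() {
    logfile=$1
    ip=$2
    # Splitting on double quotes makes field 6 the User-Agent;
    # index(...) == 1 requires the line to start with "<ip> ".
    awk -F'"' -v ip="$ip" 'index($1, ip " ") == 1 { print $6 }' "$logfile" \
        | sort | uniq -c | sort -rn
}

# Example call (path and IP are assumptions):
# agents_for_ip /var/log/apache2/access.log 192.0.2.10
```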

3. Generate list of directories and files that are most called by clients

webhosting-server:~# cd /var/log/apache2
webhosting-server:/var/log/apache2# awk '{print $7}' access.log|cut -d'?' -f1|sort|uniq -c|sort -nk1|tail -n10

(take into consideration that this info is based only on the current records in /var/log/apache2/access.log and is short term; for long term statistics you have to merge all the existing gzipped /var/log/apache2/access.log.*.gz files)

To merge all the old gzipped files into one single file and later analyze it with the command shown above, run:


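cd /var/log/apache2/
# work on copies so the original rotated logs stay untouched
mkdir -p apache-gzipped
cp -pf *access.log*.gz apache-gzipped/
cd apache-gzipped
# unpack every rotated log (gzip -d removes the .gz file once unpacked)
for i in *access*.log.*.gz; do gzip -d "$i"; done
# append all the unpacked logs into one single file
for i in *access*.log.*; do cat "$i" >> access_log_complete; done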
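If you would rather not copy and unpack anything, the same statistics can be computed by streaming the rotated logs directly. This is only a sketch under my own assumptions: the top_urls_all_logs name is hypothetical, and it expects the files in the given directory to be named access.log, access.log.1, access.log.2.gz and so on.

```shell
#!/bin/sh
# Hypothetical helper: most requested paths across the current and all
# rotated access logs in a directory, without unpacking them on disk.
top_urls_all_logs() {
    logdir=$1
    count=${2:-10}
    for f in "$logdir"/access.log*; do
        case $f in
            *.gz) gzip -cd "$f" ;;   # rotated, compressed log: decompress to stdout
            *)    cat "$f" ;;        # current or uncompressed log
        esac
    done | awk '{print $7}' | cut -d'?' -f1 | sort | uniq -c | sort -rn | head -n "$count"
}

# Example call (path is an assumption):
# top_urls_all_logs /var/log/apache2 10
```

Nothing is written to disk this way, which is a small relief on servers where /var is already tight on space.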

Though the focus of the above article is Apache web server log analysis, the given command examples can easily be reworked to work properly on other web servers: LigHTTPD, Nginx etc.

The above commands put a higher load on your server during execution, so on busy servers it is a better idea to first synchronize the access.log files to another, less loaded server. In most small and mid-sized companies this is done by periodically synchronizing the logs to a log server that is usually used only to store various log files, which are later used for various analyses or to run analysis software such as Awstats, Webalizer, Piwik, GoAccess etc.

It is also worth mentioning one great must-have text console tool for analyzing Apache logs in real time, for those too lazy to type so much: Apache-top. That script will not be installed on most web hosting servers and VPS-es though, so if you don't have root admin access on the server (you don't own a self-hosted dedicated server, work for the web hosting company etc.) but only an ordinary server account, you can use the above commands to get an overall picture of abusive web server IPs.


If you have a Linux machine with a desktop GUI environment and have somehow mounted the web log server partition remotely, another really awesome way to visualize in real time the connect requests to the web server (Apache / Nginx etc.) is with Logstalgia.

Well, that's all folks, I hope this article taught you something new. Enjoy

Thanks for article neo-tux picture to

Play the Dangerous Dave old arcade classic on iPhone, iPad and Android Smartphone – Dangerous Dave 1990’s computer arcade classic Mario like game phone Application

Thursday, April 27th, 2017


I still remember the good old times with my 16-bit desktop personal computer Pravetz with an 8086 CPU, where one of my most favourite games, a computer substitute for Mario on the DOS operating system, was Dangerous Dave 2 (DDAVE.EXE), an arcade classic from the distant year 1990 authored by a whiz kid who later became the world famous computer game programmer John Romero. He is mostly known for being a cofounder of the game creation company id Software, which authored 3D shooter genesis classics such as Wolfenstein 3D, Spear of Destiny, DOOM I and DOOM II, HeXen I / II, QUAKE I and QUAKE II, as well as some absolute arcade classics such as Commander Keen 4 🙂

As John Romero himself shared, the game is actually inspired by Super Mario Bros, so in his teenage years he decided to create a kind of computer remake of the game, and he did a great job, yeah 🙂

There are similarities between Super Mario and Dangerous Dave, as both have secret levels, similar level design, monsters, and jumping all around collecting cups with the final aim of reaching the level exit door.

The game was originally developed for the Apple II and later reworked and ported to DOS, and because of its immense popularity Dave 2, 3 and 4 came out shortly after.

The game is really awesome and worth all praise. I was nicely surprised to find out that the amazing Dangerous Dave is available for iPhone 5, 5S and iPhone 6 right in the App Store.

Here is the awesome Dangerous DAVE Iphone port description:

"Dave is a redneck on a rampage to reclaim his stolen trophies from the town bully, Clyde! Dangerous Dave is back in his classic adventure in the Deserted Pirate's Hideout. This recreation of the original 1990 DOS game is just as action-packed and difficult as the original! There are only 10 levels, but, wow, are they hard. "


I have to say the game controls, even though reimplemented for the iPhone touch screen, are truly amazing, so the gameplay pretty much resembles the keyboard controls of the original computer game, and in a sense the touch screen controls are even a little bit more convenient.

The iOS Dave port is pretty nice and an updated version is also available, which can be chosen on the game entry screen, so you either play classic mode or you play the Dave in the Deserted Pirate's Hideout remake with updated GUI and sound. Below is a screenshot of the updated GUI version:


Dave in the deserted pirate hideout Updated GUI shot by Alfonso Romero – level 1


Dave in the deserted pirate hideout Updated GUI shot by Alfonso Romero – level 2


Actually Dangerous Dave is also available for Android smartphone devices, even though the controls suck a lot compared to the iPhone version. If you happen to own an Android OS phone, check here.

For those who don't own an iPhone or Android smartphone (lucky you), you can also play Dangerous Dave online via DOSBox web emulation from this URL.


For those who prefer to play Dangerous Dave as a standalone desktop application as in the good old times, on Windows 7 / 8 / 8.1 and Windows 10, on both 32 and 64 bit platforms, you can download it (as of the moment of writing this article) from here.

A mirrored version of Dangerous Dave for Windows 7/8/10, in case it disappears in the future, is available here.

People of our generation, born in 1983-1986 and now about 33 years old, have grown up with this game, and I'm pretty sure that if you happen to be one of those people you will truly enjoy replaying the quick 10 game levels and remembering the fuzzy age of computer arcade games, when every growing kid like me was obsessed with the idea of playing and completing as many games as possible, spending countless nights in front of the green and black screen and later on SVGA screens, geeking on and on, losing all idea of time and space and being completely sunk in the game.


Happy gaming ! 🙂

Some standard software programs to install on Windows to make your Windows feel more like a Linux / Unix Desktop host

Friday, March 17th, 2017


If you're a Windows user like me who is at heart a dedicated Unix user (Linux / FreeBSD / OpenBSD / NetBSD) and you end up working, for financial reasons, in some TOP 100 Fortune company (CSC, SAP, IBM, Hewlett Packard Enterprise, Oracle etc.), forced for business purposes to work on a notebook pre-installed with Windows 7 / 8 or 10 (because some programs, such as Skype for Business Desktop Share, do not run fine on Unix-like systems), but you're already so accustomed to the customizations of UNIX environments, you would probably like to make Windows resemble Linux and customize much of how Windows behaves by default.

Here is what I personally did on Windows 7 Enterprise on my work HP Elitebook notebook to give myself the extra things I'm used to on my Debian Linux desktop.

1. Downloaded and installed a standard gnome-terminal / xterm-like terminal right away (e.g. check MobaXterm, a great alternative to Putty),
2. Customized the Windows 7 appearance to be more like classical Windows XP; changed the Windows 8 / 10 start menu appearance to be more like in classic Windows 2000
3. Installed the following bunch of software

  • VIM Text Editor for Windows
  • Thunderbird Mail Client
  • OpenVPN client
  • Oracle VM Virtualbox
  • Opera
  • Mozilla Firefox
  • Password Safe
  • Ext2FS / Ext3FS (support programs)
  • F.lux (to auto adjust screen brightness day and night for better sleep)
  • ActivePerl for Windows
  • GNUWin Tools
  • (and perhaps most importantly) CygWin (to provide Windows with the most needed console Linux tools), Clink
  • WinSCP
  • Swish (to be able to remotely mount your Linux partitions and see them as local Windows drives)
  • dosbox (to play some of the good old Dos games :))
  • Windirstat (to easily check the size of complete directory and subdirectories)
  • SpaceSniffer (to be able to see which directory or files are taking the most space on the system)

Along with all the above goodies, here is also some good software I find essential for every web developer / system administrator / network administrator or Java, C, PHP programmer out there who is using Windows as a desktop platform.

Another thing I prefer on Windows 7 when used as a workstation is to change the default Windows 7 LogonUI screen background as well; check out how here.

Perhaps there are plenty of other good programs to install on Windows to make it feel even more like a Linux / Unix desktop host. If you happened to stumble on this article and you've migrated from a Linux / BSD desktop to Windows for work purposes, please share with me any other goodies from the *Unix world that you happen to use.

Windows missing volume control on Windows 7, 8 Fix / How to run volume control from command line

Thursday, March 9th, 2017



Did the Windows 7/8 Volume icon disappear from the Taskbar?

If you are using the Windows 7 or Windows 8 operating system inside a corporate network and your notebook PC is in a domain whose domain controller is run by some crazy administrators who for some reason decided to remove the Volume icon from your Taskbar tray, you have come to exactly the same situation as I did.

Actually some might have experienced the icon "combine" feature, which lets some of the standard tray icons we know since Windows 98 / XP onwards not show all the time in order to save space. No doubt this feature is a great one to use, as it is sometimes distracting to have tons of applications constantly sitting in the Taskbar (bottom right corner). However, if the Active Directory domain admin did it without any notification and you're a kind of victim, you might dislike it, especially since this behaviour makes it impossible for you to easily control your headphones / speakers and mic.



If you check in the Control Panel and click on the Sounds menu in Windows 7/8, you don't see any checkbox for adding the icon back, as I had assumed; instead, all you can see there are the audio inputs and outputs in your system's general settings.


This behavior was made on purpose and makes sense, because the taskbar icons since Win XP (if not mistaken) have to be controlled from the taskbar settings pane.

Thus, in order to bring back the disappeared icon on Windows 7 / Win 8, there is a taskbar properties feature that lets you hide or show the various apps running in the taskbar, among them the Volume icon. Hence, to bring back your Volume Control speaker icon to the taskbar you need to customize it.

To do so do a mouse Right-click anywhere on the taskbar and choose Properties.


Now, click on the Customize button under Notification area.


In the Notification Area Icons dialog box, there are 2 things to check. Make sure the volume icon's default behavior is set to Show icon and notifications, like in the screenshot below


To make the new behaviour active click on Turn system icons on or off.


One thing to note here is that the volume icon should be set to On, like in the screenshot below:


If the reason for the disappearance of the Volume control in the taskbar is not a Domain Controller policy, it could be caused by recent updates pushed by Microsoft when the PC needs a restart, or it could happen after a computer Log off operation.
Another reason for the occasional disappearance of the sound icon could also be a buggy driver, so if the icon keeps disappearing over and over again, you had better try to update the driver for your sound card.

However, if you end up with a Windows Domain Controller (AD) policy that prohibits the Sound Volume icon from appearing on your taskbar, like in my case, all of the above won't help you solve it, but luckily there is an easy way to invoke the Volume Control dialog box via




The command will bring up the Volume Control in the upper left corner of the screen, like in the screenshot below:



If you want to show it with a slider, use the -f flag

sndvol.exe -f

Running just

sndvol.exe

opens the volume mixer, as you noted.
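If you need to pop up the volume control from a script (for example from a login script on a PC where the tray icon is blocked by policy), the commands above could be wrapped roughly like this. It is only a minimal sketch in Python: the function names are my own invention, and it assumes sndvol.exe is reachable through the PATH, as it normally is on Windows 7 / 8.

```python
import subprocess
import sys


def sndvol_command(slider: bool = False) -> list[str]:
    """Build the argument list for sndvol.exe.

    slider=True adds the -f flag described above (the slider popup);
    without it, sndvol.exe opens the volume mixer.
    """
    command = ["sndvol.exe"]
    if slider:
        command.append("-f")
    return command


def open_volume_control(slider: bool = False) -> bool:
    """Start sndvol.exe on Windows; do nothing on other platforms."""
    if sys.platform != "win32":
        return False
    # Popen returns immediately, so the script does not wait for the dialog.
    subprocess.Popen(sndvol_command(slider))
    return True


if __name__ == "__main__":
    open_volume_control(slider=True)
```

On a non-Windows host open_volume_control() simply returns False, so the same script can be kept in a mixed Linux / Windows toolbox without breaking.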


On Windows XP, to open a Volume Control dialog that is missing from the taskbar, use the sndvol32 command from the Windows Command Prompt instead:

Start -> Run -> cmd.exe

sndvol32

no params to display master volume window

sndvol32 -x

to display small master volume window

sndvol32 -t

to display volume control only (as per sound icon)

If your Volume Control icon is set to be hidden, or you need to view any other hidden taskbar application icon, it will be useful for you to use:

AutoHotKey Win+B to focus on the system tray, Left (arrow) to highlight the Volume Control icon, and then Enter to bring up the popup.


A good tip you might be interested in using occasionally is how to show the current Wireless networks via a command (if that's otherwise prohibited via the GUI), so you can easily see the Connected Networks on Windows using cmd:

rundll32 van.dll,RunVAN

Must have software on freshly installed windows – Essential Software after fresh Windows install

Friday, March 18th, 2016


If you're in the IT industry, even if you don't like installing Windows frequently or you're a complete Linux / BSD user, you will certainly have a lot of friends who will want your help to re-install or fix their Windows 7 / 8 / 10 OS. At least this is the case with me: every year I'm kind of obliged to install fresh Windows on the newly bought notebooks / desktop PCs of friends or relatives.

Of course, the preferences for necessary software vary depending on for whom the new Windows OS is installed; however, there is more or less a sort of standard list of Windows Software which is used daily by most Average Computer users, such as:

I tend to install on New Windows installs and thus I have more or less systematized the process.

As a Free Software enthusiast I usually try to stick to free software where possible for each of the above categories, and luckily nowadays there is a lot of non-proprietary, or at least free as in beer, software available out there.

For Windows sysadmins, or College and other public institution networks with multiple Windows Computers which are not inside a domain, and also for people in computer repair shops where dozens of Windows pre-installs or automatic updates of a set of software are necessary daily, make sure to take a look at Ninite


As the official website introduces Ninite:

Ninite – Install and Update All Your Programs at Once

Of course, as Ninite is used by organizations such as NASA, Harvard Medical School etc., it is likely the tool might report your list of installed Windows software and various other Win PC statistical data to the Ninite developers and most likely the NSA, but this probably doesn't matter much, as this is probably already the case from the moment you choose to have a Windows OS installed on your PC.


For Windows System Administrators managing small and middle sized networks of PCs that are not inside a Domain Controller, Ninite could definitely save hours and in some cases even days of boring install and maintenance work. HP Enterprise or HP Inc. Employees or ex-employees would definitely love Ninite, because what Ninite does is pretty much like the well known HP Internal Tool PC COE.

Ninite could also prepare an installer containing multiple applications based on the choice made on Ninite's website, so that's also a great thing, especially if you need to deploy PCs for different types of Users (Scientific / Gamers / Working etc.)

Perhaps there are also other useful things to install on a fresh Windows installation; if you're using something I'm missing, let me know in comments.

Windows: command to show CPU info, PC Motherboard serial number and BIOS details

Wednesday, March 2nd, 2016


Getting CPU information, RAM info and various other hardware specifics on Windows from the GUI is pretty trivial from Computer -> Properties;
even more specifics could be obtained using third party Windows software such as CPU-Z

Perhaps there are plenty of other tools to get and log info about the hardware of a PC or notebook system, but for Windows sysadmins, especially ones who are too much in love with the command prompt way of doing things and ones who need to automate server deployment processes with BATCH (.BAT) scripts, quickly getting info about the hardware of a freshly installed remote Win server host with no additional hardware info tools is a must. You'll be happy to know there are command line tools you can use to get extra hardware information on a Windows PC / server:

The most popular tool available to present you with some basic hardware info is of course systeminfo


C:\> systeminfo

Host Name:                 REMHOST
OS Name:                   Microsoft Windows Server 2012 R2 Standard
OS Version:                6.3.9600 N/A Build 9600
OS Manufacturer:           Microsoft Corporation
OS Configuration:          Member Server
OS Build Type:             Multiprocessor Free
Registered Owner:          Registrar
Registered Organization:   Registrar
Product ID:                00XXX-X0000-00000-XX235
Original Install Date:     17/02/2016, 11:38:39
System Boot Time:          18/02/2016, 14:16:48
System Manufacturer:       VMware, Inc.
System Model:              VMware Virtual Platform
System Type:               x64-based PC
Processor(s):              1 Processor(s) Installed.
                           [01]: Intel64 Family 6 Model 45 Stepping 7 GenuineIntel ~2600 Mhz
BIOS Version:              Phoenix Technologies LTD 6.00, 11/06/2014
Windows Directory:         C:\Windows
System Directory:          C:\Windows\system32
Boot Device:               \Device\HarddiskVolume1
System Locale:             de;German (Germany)
Input Locale:              de;German (Germany)
Time Zone:                 (UTC+01:00) Amsterdam, Berlin, Bern, Rome, Stockholm,
Total Physical Memory:     4,095 MB
Available Physical Memory: 2,395 MB
Virtual Memory: Max Size:  10,239 MB
Virtual Memory: Available: 8,681 MB
Virtual Memory: In Use:    1,558 MB
Page File Location(s):     C:\pagefile.sys
Logon Server:              \\DOM
Hotfix(s):                 148 Hotfix(s) Installed.
                           [01]: KB2894852
                           [02]: KB2894856
                           [03]: KB2918614
                           [04]: KB2919355

Now, though systeminfo lists hardware details and the installed Windows KBXXXXX OS Hotfix patches, the command does not provide you with detailed info about the system's BIOS (such as its serial number); thus to get this info you'll also have to use wmic (Windows Management Instrumentation Command).
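Since the systeminfo output above is just "Field name: value" lines, it is easy to post-process in a deployment script. Below is a minimal sketch in Python of how that could look. The function name parse_systeminfo is made up for this example, and it assumes English field names like in the listing above (on a localized Windows the names will differ).

```python
def parse_systeminfo(text: str) -> dict[str, list[str]]:
    """Parse `systeminfo` output into {field name: [values]}.

    Lines that start with whitespace (for example "[01]: KB2894852") are
    treated as extra values of the most recent field.
    """
    info: dict[str, list[str]] = {}
    current = None
    for line in text.splitlines():
        if not line.strip():
            continue
        if line[0].isspace():
            # Continuation line: attach it to the field seen last.
            if current is not None:
                info[current].append(line.strip())
            continue
        key, sep, value = line.partition(":")
        if not sep:
            continue  # not a "Name: value" line, e.g. a wrapped fragment
        current = key.strip()
        info.setdefault(current, []).append(value.strip())
    return info


# A few lines copied from the listing above, used as sample input.
SAMPLE = """\
Host Name:                 REMHOST
OS Name:                   Microsoft Windows Server 2012 R2 Standard
Total Physical Memory:     4,095 MB
Hotfix(s):                 148 Hotfix(s) Installed.
                           [01]: KB2894852
                           [02]: KB2894856
"""

if __name__ == "__main__":
    parsed = parse_systeminfo(SAMPLE)
    print(parsed["Host Name"][0])
    print(len(parsed["Hotfix(s)"]) - 1, "hotfix lines listed")
```

On a real host the text could come from subprocess.run(["systeminfo"], capture_output=True, text=True).stdout. Fields that repeat, such as the three "Virtual Memory:" lines, simply end up as several values under the same key.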


So What Is WMIC?

WMIC extends WMI for operation from several command-line interfaces and through batch scripts. Before WMIC, you used WMI-based applications (such as SMS), the WMI Scripting API, or tools such as CIM Studio to manage WMI-enabled computers. Without a firm grasp on a programming language such as C++ or a scripting language such as VBScript and a basic understanding of the WMI namespace, do-it-yourself systems management with WMI was difficult. WMIC changes this situation by giving you a powerful, user-friendly interface to the WMI namespace.

WMIC is more intuitive than WMI, in large part because of aliases. Aliases take simple commands that you enter at the command line, then act upon the WMI namespace in a predefined way, such as constructing a complex WMI Query Language (WQL) command from a simple WMIC alias Get command. Thus, aliases act as friendly syntax intermediaries between you and the namespace. For example, when you run a simple WMIC command such as

Here is how to use wmic to get PC Motherboard serial numbers, CPU and BIOS details:


C:\> wmic bios get name,serialnumber,version


The above will print the name of your BIOS, its current version and its serial number, if there is any.

If you need to get more info about the specific Motherboard installed on host:


C:\> wmic csproduct get name,identifyingnumber,uuid


This command will show the motherboard model (product name), its identifying number and its UUID

If you want to quickly get the CPU clock speed of the hardware Windows is running on

C:\> wmic cpu get name,CurrentClockSpeed,MaxClockSpeed


Also, if you have turbo boost CPUs, the above command will help you find what the Max Clock Speed is that your system is capable of with the current hardware configuration.

If you do have dynamic clock speed running, then use the following line, which will refresh and print the Clock speed every 1 second.

C:\> wmic cpu get name,CurrentClockSpeed,MaxClockSpeed /every:1
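For use inside scripts, the table output of wmic is a bit awkward to split, so here is a minimal sketch in Python that parses the Name=Value list form instead. It assumes you add the /value switch to the get command (this switch is not used in the commands above), and the function name plus the sample CPU names and numbers are hypothetical, only there to show the shape of the data.

```python
def parse_wmic_value(text: str) -> list[dict[str, str]]:
    """Parse `wmic <alias> get <props> /value` output into one dict per instance.

    Every property is printed as Name=Value. Blank lines are ignored because
    wmic pads its output with extra line breaks; a property name that repeats
    is taken as the start of the next instance (e.g. a second CPU).
    """
    records: list[dict[str, str]] = []
    current: dict[str, str] = {}
    for raw_line in text.splitlines():
        name, sep, value = raw_line.strip().partition("=")
        if not sep:
            continue  # blank or unrelated line
        if name in current:
            records.append(current)
            current = {}
        current[name] = value
    if current:
        records.append(current)
    return records


# Hypothetical output for a host with two CPU sockets.
SAMPLE = """

CurrentClockSpeed=2600
MaxClockSpeed=2600
Name=Example CPU A


CurrentClockSpeed=1200
MaxClockSpeed=2600
Name=Example CPU B

"""

if __name__ == "__main__":
    for cpu in parse_wmic_value(SAMPLE):
        print(cpu["Name"], cpu["CurrentClockSpeed"], "of", cpu["MaxClockSpeed"], "MHz")
```

The same parser should work for the bios and csproduct aliases shown earlier, since with /value they are printed in the same Name=Value form.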

Actually wmic is a great tool

Mount remote Linux SSHFS Filesystem harddisk on Windows Explorer SWISH SSHFS file mounter and a short evaluation on what is available to copy files to SSHFS from Windows PC

Monday, February 22nd, 2016


I'm forced to use Windows on my work notebook and I found it really irritating that I can't easily share files through DropBox, Google Drive, MS OneDrive, Amazon Storage or another free remote cloud-storage service.
I don't want to use DropBox-like non-self-hosted Data storage because I want to keep my data private, and therefore the only and best option for me was to make it possible to share a directory of my Linux harddisk storage remotely to the Windows notebook.

I didn't want to set up some complex environment such as a Samba Share Server (which used to be a common option to share files from a Linux server to Windows), nor did I want to bother with installing an FTP service and FTP clients, or with configuring some other complex stuff such as WebDav – which BTW is an accepted and heavily used solution across corporate clients to read / write files on remote Linux servers.
Hence, I did some quick research on what else could be used to easily share files and data from a Windows PC / notebook to a home brew or professionally hosted Linux server.

It turned out there are a few pieces of software that give a similar possibility to users of a small home LAN with a hybrid Linux / Windows network; here are a few of the many:

  • Syncthing – Syncthing is an open-source file synchronization client/server application, written in Go, implementing its own, equally free Block Exchange Protocol. The source code's content-management repository is hosted on GitHub







  • OwnCloud – ownCloud provides universal access to your files via the web, your computer or your mobile devices







  • Seafile – Seafile is a file hosting software system. Files are stored on a central server and can be synchronized with personal computers and mobile devices via the Seafile client. Files can also be accessed via the server's web interface


I've checked all of them and gave a quick try to Syncthing, which is really easy to start: just download the binary, launch it and configure it under the https://Localhost:8385 URL from a browser on the Linux server.
Syncthing seemed to be a nice and easy to configure solution to Sync files between Server A (Windows) and Server B (Linux), and I guess many would enjoy it; if you want to give it a try you can follow this short install syncthing article.
However, what I disliked in SyncThing, OwnCloud, Seafile and all of the other Sync file solutions was that they only supported synchronization via web, didn't seem to have a Windows Explorer integration, and required
the server to run more services, opening another security hole in the system, as such third party software is not easy to update and maintain.

Because of that, after thinking about some other ways to copy files from the Windows notebook to the Linux server, I've finally decided to give SSHFS a try. Mounting SSHFS between two Linux / UNIX hosts is quite an easy task with the sshfs tool
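For reference, the Linux-to-Linux case boils down to one sshfs call of the form sshfs user@host:/remote/dir /local/mountpoint. Here is a minimal sketch in Python that only builds the command lines, so it can be dropped into a bigger script. The function names and the user, IP and paths in the example are hypothetical, and it assumes the sshfs package (with fusermount) is installed on the client.

```python
import shlex


def sshfs_mount_command(user: str, host: str, remote_dir: str, mountpoint: str) -> list[str]:
    """Build the argv for: sshfs user@host:remote_dir mountpoint"""
    return ["sshfs", f"{user}@{host}:{remote_dir}", mountpoint]


def sshfs_unmount_command(mountpoint: str) -> list[str]:
    """Build the argv for unmounting a FUSE mount on Linux."""
    return ["fusermount", "-u", mountpoint]


if __name__ == "__main__":
    # Hypothetical user, host and directories; replace them with your own.
    mount = sshfs_mount_command("georgi", "192.168.0.10", "/home/georgi/share", "/mnt/share")
    print(shlex.join(mount))
    print(shlex.join(sshfs_unmount_command("/mnt/share")))
```

The lists can be passed as they are to subprocess.run(), which avoids shell quoting problems if a directory name contains spaces.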

On Windows, however, the only way I knew to transfer files to Linux via SSHFS was with the WinSCP client and other SCP clients, as well as the experimental:

As well as a few others such as ExpandDrive, Netdrive, Dokan SSHFS (mirrored for download here)
I should say that I first decided to try copying a few dozen Gigabytes of movies, text, books etc. using a direct WinSCP connection, but after getting a couple of timeouts I was tired of WinSCP and decided to look for a better way to copy to the remote Linux SSHFS.
However, the best solution I found after a bit of extensive research turned out to be:

SWISH – Easy SFTP for Windows

Swish is very straightforward to configure compared to all of them: you download the .exe, which as of the time of writing is at version 0.8.0, install it on the PC, and right in My Computer you will get a New Device called Swish next to your local and remote drives C:/ D:/, USBs etc.


As you see in the screenshot below, two new non-standard buttons will appear in Windows Explorer that let you configure SWISH


The next and final step before you have the SSHFS remote Linux filesystem visible on Windows XP / 7 / 8 / 10 is to fill in the remote Linux hostname address (or even better fill in the IP to get rid of possible DNS issues), UserName (UserID) and Directory to mount.


Then you will see the SSHFS mounted:


You will be asked to accept the SSH host-key, as it has been unknown so far


That's it now you will see straight into Windows Explorer the remote Linux SSHFS mounted:


Once a Swish connection is set up, to copy files directly to it you can use the Send to Embedded Windows dialog, as in the screenshot below


The only 3 problems with SWISH are:

1. It doesn't support Save password, so after every Windows PC reboot, when you want to connect to the remote Linux SSHFS, you will have to retype the remote login user's password.
From a security standpoint this is not such a bad thing, but it is a bit irritating to type the password every time without an option to save it permanently.
The good thing here is that you can use Launch Key Agent, as visible in the above screenshot, and set your remote host SSH key in the Putty Key Agent, so the passwordless login will work without any password prompt at all. However, this might open a security hole if your Win PC gets infected by a virus, which might delete something on the remote mounted SSHFS filesystem, so I personally prefer to retype the password on every boot.

2. It is a bit slow, so if you're planning to Transfer large amounts of Data such as hundreds of megabytes, expect a very slow transfer rate; even in a Local 10Mbit Network, transferring 20 – 30 GB of data took me about 2-3 hours or so.
SWISH is not actively supported and it hasn't had a new release since the 20th of June 2013, but for the general work I need, it is just perfect, as I don't tend to be Synchronizing Dozens of Gigabytes all the time between my notebook PC and the Linux server.

3. If you don't use the established mounted connection for a while, or your computer goes to sleep mode, then after waking up your connection to the remote Linux HDD, if opened in Windows File Explorer, will probably be dead and you will have to re-enable it.

Mac OS X users who want to mount / attach a remote directory from a Linux partition should look into Fugu – a Mac OS X SFTP, SCP and SSH Frontend

I'll be glad to hear from people on other good ways to achieve same results as with SWISH but have a better copy speed while using SSHFS.

Where is Firefox plugin / bookmarks / temp files directory on Windows?

Saturday, February 13th, 2016


If you want to quickly find out where Firefox downloads and keeps its installed extensions, just press together:
Windows flag key + R

This shortcut will open the Windows Run prompt
Then paste inside the Run prompt



%appdata% is a Windows internal variable that holds the path to C:\Users\Your-Username\AppData\Roaming

On my workPC this contains:

C:\Users\georgi7>echo %appdata%
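If you want to get to the same place from a script, the %appdata% variable can be read from the environment. Below is a minimal sketch in Python. It assumes the standard Firefox layout, where the profiles (and inside them the extensions) sit under Mozilla\Firefox\Profiles in %appdata%. The function name is my own, and ntpath is used so the path is always built Windows style.

```python
import ntpath
import os


def firefox_profiles_dir(env=None) -> str:
    """Return <%appdata%>\\Mozilla\\Firefox\\Profiles as a Windows path string.

    env is any mapping with an APPDATA entry; by default the real
    environment of the current process is used.
    """
    env = os.environ if env is None else env
    appdata = env.get("APPDATA")
    if not appdata:
        raise KeyError("APPDATA is not set (not running on Windows?)")
    return ntpath.join(appdata, "Mozilla", "Firefox", "Profiles")


if __name__ == "__main__":
    # Placeholder user name, the same one used in the explanation above.
    fake_env = {"APPDATA": r"C:\Users\Your-Username\AppData\Roaming"}
    print(firefox_profiles_dir(fake_env))
```

On a real Windows session, calling firefox_profiles_dir() with no arguments picks up the logged-in user's own %appdata% value.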



Enjoy 🙂

