Posts Tagged ‘client’

Block Web server over loading Bad Crawler Bots and Search Engine Spiders with .htaccess rules

Monday, September 18th, 2017

howto-block-webserver-overloading-bad-crawler-bots-spiders-with-htaccess-modrewrite-rules-file

In last post, I've talked about the problem of Search Index Crawler Robots aggressively crawling websites and how to stop them (the article is here) explaning how to raise delays between Bot URL requests to website and how to completely probhit some bots from crawling with robots.txt.

As explained in article the consequence of too many badly written or agressive behaviour Spider is the "server stoning" and therefore degraded Web Server performance as a cause or even a short time Denial of Service Attack, depending on how well was the initial Server Scaling done.

The bots we want to filter are not to be confused with the legitimate bots, that drives real traffic to your website, just for information

 The 10 Most Popular WebCrawlers Bots as of time of writting are:
 

1. GoogleBot (The Google Crawler bots, funnily bots become less active on Saturday and Sundays :))

2. BingBot (Bing.com Crawler bots)

3. SlurpBot (also famous as Yahoo! Slurp)

4. DuckDuckBot (The dutch search engine duckduckgo.com crawler bots)

5. Baiduspider (The Chineese most famous search engine used as a substitute of Google in China)

6. YandexBot (Russian Yandex Search engine crawler bots used in Russia as a substitute for Google )

7. Sogou Spider (leading Chineese Search Engine launched in 2004)

8. Exabot (A French Search Engine, launched in 2000, crawler for ExaLead Search Engine)

9. FaceBot (Facebook External hit, this crawler is crawling a certain webpage only once the user shares or paste link with video, music, blog whatever  in chat to another user)

10. Alexa Crawler (la_archiver is a web crawler for Amazon's Alexa Internet Rankings, Alexa is a great site to evaluate the approximate page popularity on the internet, Alexa SiteInfo page has historically been the Swift Army knife for anyone wanting to quickly evaluate a webpage approx. ranking while compared to other pages)

Above legitimate bots are known to follow most if not all of W3C – World Wide Web Consorium (W3.Org) standards and therefore, they respect the content commands for allowance or restrictions on a single site as given from robots.txt but unfortunately many of the so called Bad-Bots or Mirroring scripts that are burning your Web Server CPU and Memory mentioned in previous article are either not following /robots.txt prescriptions completely or partially.

Hence with the robots.txt unrespective bots, the case the only way to get rid of most of the webspiders that are just loading your bandwidth and server hardware is to filter / block them is by using Apache's mod_rewrite through

 

.htaccess


file

Create if not existing in the DocumentRoot of your website .htaccess file with whatever text editor, or create it your windows / mac os desktop and transfer via FTP / SecureFTP to server.

I prefer to do it directly on server with vim (text editor)

 

 

vim /var/www/sites/your-domain.com/.htaccess

 

RewriteEngine On

IndexIgnore .htaccess */.??* *~ *# */HEADER* */README* */_vti*

SetEnvIfNoCase User-Agent "^Black Hole” bad_bot
SetEnvIfNoCase User-Agent "^Titan bad_bot
SetEnvIfNoCase User-Agent "^WebStripper" bad_bot
SetEnvIfNoCase User-Agent "^NetMechanic" bad_bot
SetEnvIfNoCase User-Agent "^CherryPicker" bad_bot
SetEnvIfNoCase User-Agent "^EmailCollector" bad_bot
SetEnvIfNoCase User-Agent "^EmailSiphon" bad_bot
SetEnvIfNoCase User-Agent "^WebBandit" bad_bot
SetEnvIfNoCase User-Agent "^EmailWolf" bad_bot
SetEnvIfNoCase User-Agent "^ExtractorPro" bad_bot
SetEnvIfNoCase User-Agent "^CopyRightCheck" bad_bot
SetEnvIfNoCase User-Agent "^Crescent" bad_bot
SetEnvIfNoCase User-Agent "^Wget" bad_bot
SetEnvIfNoCase User-Agent "^SiteSnagger" bad_bot
SetEnvIfNoCase User-Agent "^ProWebWalker" bad_bot
SetEnvIfNoCase User-Agent "^CheeseBot" bad_bot
SetEnvIfNoCase User-Agent "^Teleport" bad_bot
SetEnvIfNoCase User-Agent "^TeleportPro" bad_bot
SetEnvIfNoCase User-Agent "^MIIxpc" bad_bot
SetEnvIfNoCase User-Agent "^Telesoft" bad_bot
SetEnvIfNoCase User-Agent "^Website Quester" bad_bot
SetEnvIfNoCase User-Agent "^WebZip" bad_bot
SetEnvIfNoCase User-Agent "^moget/2.1" bad_bot
SetEnvIfNoCase User-Agent "^WebZip/4.0" bad_bot
SetEnvIfNoCase User-Agent "^WebSauger" bad_bot
SetEnvIfNoCase User-Agent "^WebCopier" bad_bot
SetEnvIfNoCase User-Agent "^NetAnts" bad_bot
SetEnvIfNoCase User-Agent "^Mister PiX" bad_bot
SetEnvIfNoCase User-Agent "^WebAuto" bad_bot
SetEnvIfNoCase User-Agent "^TheNomad" bad_bot
SetEnvIfNoCase User-Agent "^WWW-Collector-E" bad_bot
SetEnvIfNoCase User-Agent "^RMA" bad_bot
SetEnvIfNoCase User-Agent "^libWeb/clsHTTP" bad_bot
SetEnvIfNoCase User-Agent "^asterias" bad_bot
SetEnvIfNoCase User-Agent "^httplib" bad_bot
SetEnvIfNoCase User-Agent "^turingos" bad_bot
SetEnvIfNoCase User-Agent "^spanner" bad_bot
SetEnvIfNoCase User-Agent "^InfoNaviRobot" bad_bot
SetEnvIfNoCase User-Agent "^Harvest/1.5" bad_bot
SetEnvIfNoCase User-Agent "Bullseye/1.0" bad_bot
SetEnvIfNoCase User-Agent "^Mozilla/4.0 (compatible; BullsEye; Windows 95)" bad_bot
SetEnvIfNoCase User-Agent "^Crescent Internet ToolPak HTTP OLE Control v.1.0" bad_bot
SetEnvIfNoCase User-Agent "^CherryPickerSE/1.0" bad_bot
SetEnvIfNoCase User-Agent "^CherryPicker /1.0" bad_bot
SetEnvIfNoCase User-Agent "^WebBandit/3.50" bad_bot
SetEnvIfNoCase User-Agent "^NICErsPRO" bad_bot
SetEnvIfNoCase User-Agent "^Microsoft URL Control – 5.01.4511" bad_bot
SetEnvIfNoCase User-Agent "^DittoSpyder" bad_bot
SetEnvIfNoCase User-Agent "^Foobot" bad_bot
SetEnvIfNoCase User-Agent "^WebmasterWorldForumBot" bad_bot
SetEnvIfNoCase User-Agent "^SpankBot" bad_bot
SetEnvIfNoCase User-Agent "^BotALot" bad_bot
SetEnvIfNoCase User-Agent "^lwp-trivial/1.34" bad_bot
SetEnvIfNoCase User-Agent "^lwp-trivial" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.6" bad_bot
SetEnvIfNoCase User-Agent "^BunnySlippers" bad_bot
SetEnvIfNoCase User-Agent "^Microsoft URL Control – 6.00.8169" bad_bot
SetEnvIfNoCase User-Agent "^URLy Warning" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.5.3" bad_bot
SetEnvIfNoCase User-Agent "^LinkWalker" bad_bot
SetEnvIfNoCase User-Agent "^cosmos" bad_bot
SetEnvIfNoCase User-Agent "^moget" bad_bot
SetEnvIfNoCase User-Agent "^hloader" bad_bot
SetEnvIfNoCase User-Agent "^humanlinks" bad_bot
SetEnvIfNoCase User-Agent "^LinkextractorPro" bad_bot
SetEnvIfNoCase User-Agent "^Offline Explorer" bad_bot
SetEnvIfNoCase User-Agent "^Mata Hari" bad_bot
SetEnvIfNoCase User-Agent "^LexiBot" bad_bot
SetEnvIfNoCase User-Agent "^Web Image Collector" bad_bot
SetEnvIfNoCase User-Agent "^The Intraformant" bad_bot
SetEnvIfNoCase User-Agent "^True_Robot/1.0" bad_bot
SetEnvIfNoCase User-Agent "^True_Robot" bad_bot
SetEnvIfNoCase User-Agent "^BlowFish/1.0" bad_bot
SetEnvIfNoCase User-Agent "^JennyBot" bad_bot
SetEnvIfNoCase User-Agent "^MIIxpc/4.2" bad_bot
SetEnvIfNoCase User-Agent "^BuiltBotTough" bad_bot
SetEnvIfNoCase User-Agent "^ProPowerBot/2.14" bad_bot
SetEnvIfNoCase User-Agent "^BackDoorBot/1.0" bad_bot
SetEnvIfNoCase User-Agent "^toCrawl/UrlDispatcher" bad_bot
SetEnvIfNoCase User-Agent "^WebEnhancer" bad_bot
SetEnvIfNoCase User-Agent "^TightTwatBot" bad_bot
SetEnvIfNoCase User-Agent "^suzuran" bad_bot
SetEnvIfNoCase User-Agent "^VCI WebViewer VCI WebViewer Win32" bad_bot
SetEnvIfNoCase User-Agent "^VCI" bad_bot
SetEnvIfNoCase User-Agent "^Szukacz/1.4" bad_bot
SetEnvIfNoCase User-Agent "^QueryN Metasearch" bad_bot
SetEnvIfNoCase User-Agent "^Openfind data gathere" bad_bot
SetEnvIfNoCase User-Agent "^Openfind" bad_bot
SetEnvIfNoCase User-Agent "^Xenu’s Link Sleuth 1.1c" bad_bot
SetEnvIfNoCase User-Agent "^Xenu’s" bad_bot
SetEnvIfNoCase User-Agent "^Zeus" bad_bot
SetEnvIfNoCase User-Agent "^RepoMonkey Bait & Tackle/v1.01" bad_bot
SetEnvIfNoCase User-Agent "^RepoMonkey" bad_bot
SetEnvIfNoCase User-Agent "^Zeus 32297 Webster Pro V2.9 Win32" bad_bot
SetEnvIfNoCase User-Agent "^Webster Pro" bad_bot
SetEnvIfNoCase User-Agent "^EroCrawler" bad_bot
SetEnvIfNoCase User-Agent "^LinkScan/8.1a Unix" bad_bot
SetEnvIfNoCase User-Agent "^Keyword Density/0.9" bad_bot
SetEnvIfNoCase User-Agent "^Kenjin Spider" bad_bot
SetEnvIfNoCase User-Agent "^Cegbfeieh" bad_bot

 

<Limit GET POST>
order allow,deny
allow from all
Deny from env=bad_bot
</Limit>

 


Above rules are Bad bots prohibition rules have RewriteEngine On directive included however for many websites this directive is enabled directly into VirtualHost section for domain/s, if that is your case you might also remove RewriteEngine on from .htaccess and still the prohibition rules of bad bots should continue to work
Above rules are also perfectly suitable wordpress based websites / blogs in case you need to filter out obstructive spiders even though the rules would work on any website domain with mod_rewrite enabled.

Once you have implemented above rules, you will not need to restart Apache, as .htaccess will be read dynamically by each client request to Webserver

2. Testing .htaccess Bad Bots Filtering Works as Expected


In order to test the new Bad Bot filtering configuration is working properly, you have a manual and more complicated way with lynx (text browser), assuming you have shell access to a Linux / BSD / *Nix computer, or you have your own *NIX server / desktop computer running
 

Here is how:
 

 

lynx -useragent="Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)" -head -dump http://www.your-website-filtering-bad-bots.com/

 

 

Note that lynx will provide a warning such as:

Warning: User-Agent string does not contain "Lynx" or "L_y_n_x"!

Just ignore it and press enter to continue.

Two other use cases with lynx, that I historically used heavily is to pretent with Lynx, you're GoogleBot in order to see how does Google actually see your website?
 

  • Pretend with Lynx You're GoogleBot

 

lynx -useragent="Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" -head -dump http://www.your-domain.com/

 

 

  • How to Pretend with Lynx Browser You are GoogleBot-Mobile

 

lynx -useragent="Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)" -head -dump http://www.your-domain.com/

 


Or for the lazy ones that doesn't have Linux / *Nix at disposal you can use WannaBrowser website

Wannabrowseris a web based browser emulator which gives you the ability to change the User-Agent on each website req1uest, so just set your UserAgent to any bot browser that we just filtered for example set User-Agent to CheeseBot

The .htaccess rule earier added once detecting your browser client is coming in with the prohibit browser agent will immediately filter out and you'll be unable to access the website with a message like:
 

HTTP/1.1 403 Forbidden

 

Just as I've talked a lot about Index Bots, I think it is worthy to also mention three great websites that can give you a lot of Up to Date information on exact Spiders returned user-agent, common known Bot traits as well as a a current updated list with the Bad Bots etc.

Bot and Browser Resources information user-agents, bad-bots and odd Crawlers and Bots specifics

1. botreports.com
2. user-agents.org
3. useragentapi.com

 

An updated list with robots user-agents (crawler-user-agents) is also available in github here regularly updated by Caia Almeido

There are also a third party plugin (modules) available for Website Platforms like WordPress / Joomla / Typo3 etc.

Besides the listed on these websites as well as the known Bad and Good Bots, there are perhaps a hundred of others that might end up crawling your webdsite that might or might not need  to be filtered, therefore before proceeding with any filtering steps, it is generally a good idea to monitor your  HTTPD access.log / error.log, as if you happen to somehow mistakenly filter the wrong bot this might be a reason for Website Indexing Problems.

Hope this article give you some valueable information. Enjoy ! 🙂

 

Share this on

How to use wget and curl via HTTP Proxy server / How to set a HTTPS proxy server on a bash shell on Linux

Wednesday, January 27th, 2016

linux-ssl-proxy-configuration-from-command-line-with-wget-and-curl-howto

I've been working a bit on a client's automation, the task is to automate process of installations of Apaches / Tomcats / JBoss and Java servers, so me and colleagues don't waste too
much time in trivial things. To complete that I've created a small repository on a Apache with a WebDav server with major versions of each general branch of Application servers and Javas.
In order to access the remote URL where the .tar.gz binaries archives reside, I had to use a proxy serve as the client runs all his network in a DMZ and all Web Port 80 and 443 HTTPS traffic inside the client network
has to pass by the network proxy.

Thus to make the downloads possible via the shell script, writting I needed to set the script to use the HTTPS proxy server. I've been using proxy earlier and I was pretty aware of the http_proxy bash shell
variable thus I tried to use this one for the Secured HTTPS proxy, however the connection was failing and thanks to colleague Anatoliy I realized the whole problem is I'm trying to use http_proxy shell variable
which has to only be used for unencrypted Proxy servers and in this case the proxy server is over SSL encrypted HTTPS protocol so instead the right variable to use is:
 

https_proxy


The https_proxy var syntax, goes like this:

proxy_url='http-proxy-url.net:8080';
export https_proxy="$proxy_url"

how-to-set-https_proxy_url-on-linux-freebsd-openbsd-bsd-and-unix-from-terminal-console

Once the https_proxy variable is set  UNIX's wget non interactive download tool starts using the proxy_url variable set proxy and the downloads in my script works.

Hence to make the different version application archives download work out, I've used wget like so:
 

 wget –no-check-certificate –timeout=5 https://full-path-to-url.net/file.rar


For other BSD / HP-UX / SunOS UNIX Servers where  shells are different from Bourne Again (Bash) Shell, the http_proxy and  https_proxy variable might not be working.
In such cases if you have curl (command line tool) is available instead of wget to script downloads you can use something like:
 

 curl -O -1 -k –proxy http-proxy-url.net:8080 https://full-path-to-url.net/file.rar

The http_proxy and https_proxy variables works perfect also on Mac OS X, default bash shell, so Mac users enjoy.
For some bash users in some kind of firewall hardened environments like in my case, its handy to permanently set a proxy to all shell activities via auto login Linux / *unix scripts .bashrc or .bash_profile that saves the inconvenience to always
set the proxy so lynx and links, elinks text console browsers does work also anytime you login to shell.

Well that's it, my script enjoys proxying traffic 🙂
 

Share this on

Fix MySQL ibdata file size – ibdata1 file growing too large, preventing ibdata1 from eating all your server disk space

Thursday, April 2nd, 2015

fix-solve-mysql-ibdata-file-size-ibdata1-file-growing-too-large-and-preventing-ibdata1-from-eating-all-your-disk-space-innodb-vs-myisam

If you're a webhosting company hosting dozens of various websites that use MySQL with InnoDB  engine as a backend you've probably already experienced the annoying problem of MySQL's ibdata1 growing too large / eating all server's disk space and triggering disk space low alerts. The ibdata1 file, taking up hundreds of gigabytes is likely to be encountered on virtually all Linux distributions which run default MySQL server <= MySQL 5.6 (with default distro shipped my.cnf). The excremental ibdata1 raise appears usually due to a application software bug on how it queries the database. In theory there are no limitation for ibdata1 except maximum file size limitation set for the filesystem (and there is no limitation option set in my.cnf) meaning it is quite possible that under certain conditions ibdata1 grow over time can happily fill up your server LVM (Storage) drive partitions.

Unfortunately there is no way to shrink the ibdata1 file and only known work around (I found) is to set innodb_file_per_table option in my.cnf to force the MySQL server create separate *.ibd files under datadir (my.cnf variable) for each freshly created InnoDB table.
 

1. Checking size of ibdata1 file

On Debian / Ubuntu and other deb based Linux servers datadir is /var/lib/mysql/ibdata1

server:~# du -hsc /var/lib/mysql/ibdata1
45G     /var/lib/mysql/ibdata1
45G     total


2. Checking info about Databases and Innodb storage Engine

server:~# mysql -u root -p
password:

mysql> SHOW DATABASES;
+——————–+
| Database           |
+——————–+
| information_schema |
| bible              |
| blog               |
| blog-sezoni        |
| blogmonastery      |
| daniel             |
| ezmlm              |
| flash-games        |


Next step is to get some understanding about how many existing InnoDB tables are present within Database server:

 

mysql> SELECT COUNT(1) EngineCount,engine FROM information_schema.tables WHERE table_schema NOT IN ('information_schema','performance_schema','mysql') GROUP BY engine;
+————-+——–+
| EngineCount | engine |
+————-+——–+
|         131 | InnoDB |
|           5 | MEMORY |
|         584 | MyISAM |
+————-+——–+
3 rows in set (0.02 sec)

To get some more statistics related to InnoDb variables set on the SQL server:
 

mysqladmin -u root -p'Your-Server-Password' var | grep innodb


Here is also how to find which tables use InnoDb Engine

mysql> SELECT table_schema, table_name
    -> FROM INFORMATION_SCHEMA.TABLES
    -> WHERE engine = 'innodb';

+————–+————————–+
| table_schema | table_name               |
+————–+————————–+
| blog         | wp_blc_filters           |
| blog         | wp_blc_instances         |
| blog         | wp_blc_links             |
| blog         | wp_blc_synch             |
| blog         | wp_likes                 |
| blog         | wp_wpx_logs              |
| blog-sezoni  | wp_likes                 |
| icanga_web   | cronk                    |
| icanga_web   | cronk_category           |
| icanga_web   | cronk_category_cronk     |
| icanga_web   | cronk_principal_category |
| icanga_web   | cronk_principal_cronk    |


3. Check and Stop any Web / Mail / DNS service using MySQL

server:~# ps -efl |grep -E 'apache|nginx|dovecot|bind|radius|postfix'

Below cmd should return empty output, (e.g. Apache / Nginx / Postfix / Radius / Dovecot / DNS etc. services are properly stopped on server).

4. Create Backup dump all MySQL tables with mysqldump

Next step is to create full backup dump of all current MySQL databases (with mysqladmin):

server:~# mysqldump –opt –allow-keywords –add-drop-table –all-databases –events -u root -p > dump.sql
server:~# du -hsc /root/dump.sql
940M    dump.sql
940M    total

 

If you have free space on an external backup server or remotely mounted attached (NFS or SAN Storage) it is a good idea to make a full binary copy of MySQL data (just in case something wents wrong with above binary dump), copy respective directory depending on the Linux distro and install location of SQL binary files set (in my.cnf).
To check where are MySQL binary stored database data (check in my.cnf):

server:~# grep -i datadir /etc/mysql/my.cnf
datadir         = /var/lib/mysql

If server is CentOS / RHEL Fedora RPM based substitute in above grep cmd line /etc/mysql/my.cnf with /etc/my.cnf

if you're on Debian / Ubuntu:

server:~# /etc/init.d/mysql stop
server:~# cp -rpfv /var/lib/mysql /root/mysql-data-backup

Once above copy completes, DROP all all databases except, mysql, information_schema (which store MySQL existing user / passwords and Access Grants and Host Permissions)

5. Drop All databases except mysql and information_schema

server:~# mysql -u root -p
password:

 

mysql> SHOW DATABASES;

DROP DATABASE blog;
DROP DATABASE sessions;
DROP DATABASE wordpress;
DROP DATABASE micropcfreak;
DROP DATABASE statusnet;

          etc. etc.

ACHTUNG !!! DON'T execute!DROP database mysql; DROP database information_schema; !!! – cause this might damage your User permissions to databases

6. Stop MySQL server and add innodb_file_per_table and few more settings to prevent ibdata1 to grow infinitely in future

server:~# /etc/init.d/mysql stop

server:~# vim /etc/mysql/my.cnf
[mysqld]
innodb_file_per_table
innodb_flush_method=O_DIRECT
innodb_log_file_size=1G
innodb_buffer_pool_size=4G

Delete files taking up too much space – ibdata1 ib_logfile0 and ib_logfile1

server:~# cd /var/lib/mysql/
server:~#  rm -f ibdata1 ib_logfile0 ib_logfile1
server:~# /etc/init.d/mysql start
server:~# /etc/init.d/mysql stop
server:~# /etc/init.d/mysql start
server:~# ps ax |grep -i mysql

 

You should get no running MySQL instance (processes), so above ps command should return blank.
 

7. Re-Import previously dumped SQL databases with mysql cli client

server:~# cd /root/
server:~# mysql -u root -p < dump.sql

Hopefully import should went fine, and if no errors experienced new data should be in.

Altearnatively if your database is too big and you want to import it in less time to mitigate SQL downtime, instead import the database with:

server:~# mysql -u root -p
password:
mysql>  SET FOREIGN_KEY_CHECKS=0;
mysql> SOURCE /root/dump.sql;
mysql> SET FOREIGN_KEY_CHECKS=1;

 

If something goes wrong with the import for some reason, you can always copy over sql binary files from /root/mysql-data-backup/ to /var/lib/mysql/
 

8. Connect to mysql and check whether databases are listable and re-check ibdata file size

Once imported login with mysql cli and check whther databases are there with:

server:~# mysql -u root -p
SHOW DATABASES;

Next lets see what is currently the size of ibdata1, ib_logfile0 and ib_logfile1
 

server:~# du -hsc /var/lib/mysql/{ibdata1,ib_logfile0,ib_logfile1}
19M     /var/lib/mysql/ibdata1
1,1G    /var/lib/mysql/ib_logfile0
1,1G    /var/lib/mysql/ib_logfile1
2,1G    total

Now ibdata1 will grow, but only contain table metadata. Each InnoDB table will exist outside of ibdata1.
To better understand what I mean, lets say you have InnoDB table named blogdb.mytable.
If you go into /var/lib/mysql/blogdb, you will see two files
representing the table:

  •     mytable.frm (Storage Engine Header)
  •     mytable.ibd (Home of Table Data and Table Indexes for blogdb.mytable)

Now construction will be like that for each of MySQL stored databases instead of everything to go to ibdata1.
MySQL 5.6+ admins could relax as innodb_file_per_table is enabled by default in newer SQL releases.


Now to make sure your websites are working take few of the hosted websites URLs that use any of the imported databases and just browse.
In my case ibdata1 was 45GB after clearing it up I managed to save 43 GB of disk space!!!

Enjoy the disk saving! 🙂

Share this on

Apache Webserver disable file extension – Forbid / Deny access in Apache config to certain file extensions for a Virtualhost

Monday, March 30th, 2015


If you're a Webhosting company sysadmin like me and you already have configured directory listing for certain websites / Vhosts and those files are mirrored from other  development webserver location but some of the uploaded developer files extensions which are allowed to be interptered such as php include files .inc / .htaccess mod_rewrite rules / .phps / .html / .txt need to be working on the dev / test server but needs to be disabled (excluded) from delivery or interpretting for some directory on the prod server.

Open Separate host VirtualHost file or Apache config (httpd.conf / apache2.conf)  if all Vhosts  for which you want to disable certain file extensions and add inside:

<Directory "/var/www/sploits">
        AllowOverride All

</Directory>

 

Extension Deny Rules such as:

For disabling .inc files from inclusion from other PHP sources:
 

<Files  ~ "\.inc$">
  Order allow,deny
  Deny from all
</Files>


To Disable access to .htaccess single file only

 

<Files ~ "^\.htaccess">
  Order allow,deny
  Deny from all
</Files>


To Disable .txt from being served by Apache and delivered to requestor browser:

 

<Files  ~ "\.txt$">
  Order allow,deny
  Deny from all
</Files>

 


To Disable any left intact .html from being delivered to client:

 

<Files  ~ "\.html$">
  Order allow,deny
  Deny from all
</Files>

 

Do it for as many extensions as you need.
Finally to make changes affect restart Apache as usual:

If on Deb based Linux issue:

/etc/init.d/apach2 restart


On CentOS / RHEL and other Redhats / RedHacks 🙂

/etc/init.d/httpd restart

Share this on

Quick way to access remotely your GNU / Linux Desktop – Access Linux Desktop from Mac and Windows 7

Tuesday, August 5th, 2014

how-to-access-linux-host-from-microsoft-windows-or-mac-client-xrdp-tightvnc-native-way-logo
For M$ Windows users its always handy to have remote access to your home PC or notebook via Remote Desktop (RDP) protocol.

However in GNU / Linux, there is no native implementation of RDP protocol. So if you're using Linux as your Desktop like me you will probably want to be able to access the Linux system remotely not only via terminal with SSH using (Putty) or MobaXTerm all in one tabbed Windows terminal program but also be able to use your Linux GNOME / KDE Graphical environment from anywhere on the Internet.

This will make you ponder – Is it possible to access Linux Desktop via proprietary RDP protocol and if not how you can achieve remote GUI access to Linux?

1. Using Linux Xorg and Xming Xserver for Windows

Most people should already know of Linux ability to start multiple Xserver sessions remotely which is the native way to access between two Linux hosts or access remotely Linux from other Linux UNIX like OS. It is also possible to use xinit / startx / xhost commands to establish remotely connection to new or running Linux (Xorg) Xserver by using them in combination with XMing – XServer for Windows running on the Windows host and Debian package (x11-xserver-utils) – providing xhost cmd, however this method is a bit complicated and not so convenient.

I used to be using this method XMing (whose mirror is here), earlier in my university years to use remotely my Debian Linux from  Windows 98 and this works perfectly fine.

2. Using RDP emulation with XRDP server

in order to be able to access your desk from any friend or computer club in the world using standard available in MS Windows Remote Desktop client (mstsc.exe).
There is also another alternative way by using Windows Desktop sharing RDP experimental server xrdp:
 

apt-cache show xrdp |grep -i descr -A 3
Description: Remote Desktop Protocol (RDP) server
 Based on research work by the rdesktop project, xrdp uses the Remote
 Desktop Protocol to present a graphical login to a remote client.
 xrdp can connect to a VNC server or another RDP server.

To make your Linux host accessible via RDP:

On Debian / Ubuntu etc. deb based Linux:

 

apt-get update
apt-get install xrdp

 
$ /etc/init.d/xrdp status
Checking status of Remote Desktop Protocol server xrdp                                             [ OK ]
Checking status of RDP Session Manager sesman

/etc/init.d/xrdp start

On  Fedora Linux:
 

yum -y install xrdp
systemctl enable xrdp.service
systemctl start xrdp.service
systemctl enable xrdp-sesman.service
systemctl start xrdp-sesman.service


It is possible to access remote Linux host using xrdp RDP server, but this will only work in older releases of mstsc.exe (Windows XP / Vista / 2003) and will not work on Windows 7 / 8, because in MS Windows 7 and onwards RDP proto version has changed and the client no longer has compatability with older mstsc releases. There is a work around for this for anyone who stubbornly want to use RDP protocol to access Linux host. If you want to connect to xrdp from Windows 7 you have to copy the old RDP client (mstsc.exe and mstscax.dll) from a WinXP install to the Windows 7 box and run it independently, from the default installed ones, anyways this method is time consuming and not really worthy …

3. Using the VNC withTightVNC server / client

 

Taking above in consideration, for me personally best way to access Linux host from Windows and Mac is to use simply the good old VNC protocol with TightVNC.

TightVNC is cross-platform free and open source remote Desktop client it uses RFB protocol to control another computer screen remotely.

To use tightvnc to access remote Debian / Ubuntu – deb based Linux screen, tightvncserver package has to be installed:

apt-cache show tightvncserver|grep -i desc -A 7
Description-en: virtual network computing server software
 VNC stands for Virtual Network Computing. It is, in essence, a remote
 display system which allows you to view a computing `desktop' environment
 not only on the machine where it is running, but from anywhere on the
 Internet and from a wide variety of machine architectures.

 .
 This package provides a server to which X clients can connect and the
 server generates a display that can be viewed with a vncviewer.

 

apt-get –yes install tightvncserver


TightVNCserver package is also available in default repositories of Fedora / CentOS / RHEL and most other RPM based distros, to install there:
 

yum -y install tightvnc-server


Once it is installed to make tightvncserver running you have to start it (preferrably with non-root user), usually this is the user with which you're using the system:

tightvncserver

You will require a password to access your desktops.

Password:
Verify:   
Would you like to enter a view-only password (y/n)? n

New 'X' desktop is rublev:4

Creating default startup script /home/hipo/.vnc/xstartup
Starting applications specified in /home/hipo/.vnc/xstartup
Log file is /home/hipo/.vnc/rublev:4.log

 

tightvncserver-running-in-gnome-terminal-debian-gnu-linux-wheezy-screenshot

To access now TightVncserver on the Linux host Download and Install TightVNC Viewer client

note that you need to download TightVNC Java Viewer JAR in ZIP archive – don't install 32 / 64 bit installer for Windows, as this will install and setup TightVNCServer on your Windows – and you probably don't want that (and – yes you will need to have Oracle Java VM installed) …
 

tightvnc-viewer-java-client-running-on-microsoft-windows-7-screenshot

Once unzipped run tightvnc-jviewer.jar and type in the IP address of remote Linux host and screen, where TightVNC is listening, as you can see in prior screenshot my screen is :4, because I run tightvnc to listen for connections in multiple X sessions. once you're connected you will be prompted for password, asker earlier when you run  tightvncserver cmd on Linux host.

If you happen to be on a Windows PC without Java installed or Java use is prohibited you can use TightVNC Viewer Portable Binary (mirrored here)

/images/tightvnc-viewer-portable-windows-7-desktop-screenshot

If you have troubles with connection, on Linux host check the exact port on which TightVncServer is running:
 

ps ax |grep -i Tightvnc

 8630 pts/8    S      0:02 Xtightvnc :4 -desktop X -auth /var/run/gdm3/auth-for-hipo-7dpscj/database -geometry 1024×768 -depth 24 -rfbwait 120000 -rfbauth /home/hipo/.vnc/passwd -rfbport 5904 -fp /usr/share/fonts/X11/misc/,/usr/share/fonts/X11/Type1/,/usr/share/fonts/X11/75dpi/,/usr/share/fonts/X11/100dpi/ -co /etc/X11/rgb

Then to check, whether the machine you're trying to connect from doesn't have firewall rules preventing the connection use (telnet) – if installed on the Windows host:
 

telnet www.pc-ferak.net 5904
Trying 192.168.56.101…
Connected to 192.168.56.101.
Escape character is '^]'.
RFB 003.008

telnet> quit
Connection closed.

remote-connection-via-tightvnc-to-linux-host-from-windows-7-using-tightvnc-java-client-screenshot
 

Share this on

Mysql: How to disable single database without dropping or renaming it

Wednesday, January 22nd, 2014

mysql rename forbid disable database howto logo, how to disable single database without dropping it
A colleague of mine working on MySQL database asked me How it is possible to disable a MySQL database. He is in situation where the client has 2 databases and application and is not sure which of the two databases the application uses. Therefore the client asked one of the database is disabled and wait for few hours and see if something will break / stop working and in that way determine which of the two database is used by application.

My first guess was to backup both databases and drop one of them, then if it is the wrong one to restore from the SQL dump backup, however this wasn't acceptable solution. So second I though of RENAME of database to another one and then reverting the name, however as it is written in MySQL documentation RENAME database function was removed from MySQL (found to be dangerous) since version 5.1.23 onwards. Anyhow there is a quick hack to rename mysql database using a for loop shell script one below:

mysql -e "CREATE DATABASE \`new_database\`;"
for table in `mysql -B -N -e "SHOW TABLES;" old_database`
do
  mysql -e "RENAME TABLE \`old_database\`.\`$table\` to \`new_database\`.\`$table\`"
  done
  mysql -e "DROP DATABASE \`old_database\`;"

Other possible solution was to change permissions of Application used username, however this was also complicated from mysql cli, hence I thought of installing and using PHPMyAdmin to make modify of db user permissions easier but on this server there wasn't Apache installed and MySQL is behind a firewall and only accessible via java tomcat host.

Finally after some pondering what can be done I came with solution to request to disable mysql database using chmod in /var/lib/mysql/data/, i.e.:

sql-server:~# chmod 0 /var/lib/mysql/databasename

Where databasename is the same as the database is named listable via mysql cli.

After doing it that way with no need to restart MySQL server database stopped to appear in show databases; and client confirmed that disabled database is no longer needed so we proceeded dropping it.

Hope this little article will help someone out there. Cheers :

Share this on

Web and Middleware JBoss Training at Hewlett Packard – Intro to JBoss JAVA application server

Thursday, November 21st, 2013

jboss application server logo- serve java servlet pages on Linux and Windows

I and my TEAM Web and Middleware Implementation Team @Hewlett Packard are assigned an online training to follow on topic of JBoss Application server.It is my first online training of this kind where a number of people are streamed a video from a trainer who explains in real time concepts of JBossA Community Drive open source middleware (Application Server), since some time JBoss is known under a new name (WildFly).

Wildfly new name of jboss application java servlet server

In short what is JBoss? – It is an application server similar to Apache Tomcat  -an open source software implementation of the Java Servlet and JavaServer Pages technologies.

Apache Tomcat java servlet application server logo

In case you wonder about what is Middleware it is a buzzword well established in Corporate world referring to all kind of servers in the middle between Servers on pure OS and hardware Level and end client. Middleware includes all kind of Web and Application servers like Apache, JBoss, Tomcat, Adobe's WebLogic Webserver, IBM WebSphere application server etc..

What this means is JBOSS is very similar to Tomcat but it is designed to run interpret through (Java Virtual Machine), higher scale of Java Applications and then return content to a a web browser. In other words if you need to have a Webserver with support for Java VM. JBoss is one of the open source technologies available which can be a substitute for Tomcat. In Fact Jboss itself started as a fork of Tomcat and n owadays, Jboss has an implementation of Tomcat embedded into itself. Jboss is mainly developed and supported by Redhat. It has 3 major releases used in IT Companies. Jboss 5, JBoss 6 and JBoss 7. In most production server systems running some kind of Java servlets currently still Jboss ver. 5 and Jboss v. 6 is used. Just like Tomcat, the server is messy in its structure. But if we have to compare Tomcat with Jboss then JBoss is at least 100 times more messy and hard to configure tune than Tomcat. Actually after getting to know JBoss 6 I would not advice anyone to use this Application server. Its too complex and all configuration and performance tuning is done through hundred of XML so it is like a hell for the usual System Administrator who likes clearness and simplicity. JBoss has a Web configuration interface which in version 7 is a bit advanced and easier to configure and get to know the server compared to previous versions. But same web interface for older releases is lousy and not nice. Just like Tomcat, JBoss supports clustering, here is full list of all features it supports:

  • Full clustering support for both traditional J2EE applications and EJB 3.0 POJO applications
  • Automatic discovery. Nodes in cluster find each other with no additional configuration.
  • Cluster-wide replicated JNDI context
  • Failover and load-balancing for JNDI, RMI and all EJB types
  • Stateful Session Bean state replication
  • HTTP Session replication
  • High Availability JMS
  • Farming. Distributed deployment of JBoss components. Deploying on one node deploys on all nodes.
     

Looks like JBoss is among the few Application Servers supporting deployment of Java JSP, WAR Archive files, SAR Archives, JMS (Java Message Service), JNDI (Java Naming and Directory Interface). Jboss supports load balancing between clustered nodes, supports SOAP, Java servlet faces and Java MQ (Messaging Queue). JBoss can be installed on GNU / Linux, FreeBSD and Windows. So far from what I've learned for JBOSS I prefer not to use it and don't recommend this Application server to anyone. Its too complex and doesn't worth the effort to learn. Proprietary products like WebLogic and Webspehere are in light years better.

Share this on

How to make SSH tunnel with PuTTY terminal client

Monday, November 18th, 2013

Create-how to make ssh tunnel with Putty on microsoft windows Vista / 7 XP / 2000
Earlier I blogged how to create SSH tunnels on Linux. Another interesting thing is how to make SSH tunnels on Windows. This can be done with multiple SSH clients but probably quickest and most standard way is to do create SSH tunnel with Putty. So why would one want to make SSH tunnel to a Windows host? Lets say your remote server has a port filtered to the Internet but available to a local network to which you don't have direct access, the only way to access the port in question then is to create SSH tunnel between your computer and remote machine on some locally binded port (lets say you need to access port 80 on remote host and you will access it through localhost tunneled through 8080). Very common scenario where tunneling comes handy if you have a Tomcat server behind firewalled DMZ| / load balancer or Reverse Proxy. Usually on well secured networks direct access to Tomcat application server will be disabled to its listen port (lets say 11444). Another important great think of SSH tunnels is all information between Remote server and local PC are transferred in strong SSH crypted form so this adds extra security level to your communication.
Once "real life" case of SSH tunnel is whether you have to deploy an application which fails after deployment with no meaningful message but error is returned by Apache Reverse Proxy. To test directly tomcat best thing is to create SSH tunnel between remote host 11444 and local host through 11444 (or any other port of choice). Other useful case would be if you have to access directly via CLI interface an SQL server lets say MySQL (remote port 3306 filtered) and inaccessible with mysql cli or Oracle DB with Db listener on port 1521 (needed to accessed via sqlplus).

In that case Putty's Tunneling capabilities comes handy especially if you don't have a Linux box at hand.
To create new SSH tunnel in putty to MySQL port 3306 on localhost (3306) – be sure MySQL is not running on localhost 😉
Open Putty Navigate in left pane config bar to:

SSH -> Tunnels

Type in

Source Port

– port on which SSH tunnel will be binded on your Windows (localhost / 127.0.0.1) in this example case 3306.

Then for

Destination
– IP address or host of remote host with number of port to which SSH tunnel will be opened.

N.B. ! in order to make tunneling possible you will need to have opened access to SSH port of remote (Destination) host

make ssh tunnel on Microsoft Windows putty to remote filtered mysql shot

make ssh tunnels on Microsoft windows putty to remote filtered mysql 2 screenshot

open ssh tunnel via WINDOWS port 22 on microsoft windows 7 screenshot

Once click Open you will be prompted for username on remote host in my case to my local router 83.228.93.76. Once you login to remote host open command prompt and try to connect Windows Command prompt Start -> Run (cmd.exe) ;

C:\Users\\hipo> telnet localhost 3306

Connection should be succesful and you from there on assuming you have the MySQL cli version for windows installed you can use to login to remote SQL via SSH tunnel with;

C:\Users\\hipo> mysql -u root -h localhost -p

To later remove existing SSH Tunnel go again to SSH -> Tunnels press on SSH tunnel and choose Remove

Further you can craete multiple SSH tunnels for all services to remote host where access is needed. Important think to remember when creating multiple SSH connections is source port on localhost to remote machine should be unique

Share this on

Raspberry Pi – Cheap portable credit-card sized single board Linux computer box

Thursday, November 7th, 2013

RaspberryPi tiny-computer running Linux and free software Logo

Not of a the latest thing out there but I believe a must know for every geek is existence of Raspberry Pi mini computer Linux board. It is a geek credit-card sized mini PC on extremely cheap price between 25$ and 35$ bucks (e.g. Raspberry Pi model A and Raspberry Pi Model B).

Raspberry Pi hardware you get for this ultra low price is as follows:

  • Broadcom BCM2835 system on chip
  • ARM Mobile processor model ARM1176JZF running at 700 Mhz (overlocking up to 1Ghz is possible – overclocked RP is called Turbo 🙂 )!
  • VideoCore IV GPU with 512 MBytes of ram
  • No Build hard disk or solid-state drive but instead designed to use SD-Card as a Storage
  • two video outputs
  • composite RCA and an HDMI port
  • 3.5mm audio output
  • 2 or 1SD/MMC/SDIO card slot (depending on device model A or model B)
  • Micro USB adapter power charger 500mA  (2.5 watts) – Model and 700mA (3.5 watts)

Raspberry PI mini computer hardware running Linux explained picture

The idea of whole device is to make cheap affordable device for pupils and people from third countries who can't afford to pay big money for a full-featured computer. Achievement is unique all you need to Raspberry Pi credit card sized device is external keyboard a mouse, SD-card and a monitor, this makes a 700Mhz featured almost fully functional computer for less than lets say 80$ whether used with a second hand monitor / mouse and kbd :). A fully functional computer or full functional thin client for as less as 80$ yes that's what RaspberryPi is!

It is recommendable that SD-Card storage on which it is installed is at least 4GB as this is part of its minimum requirement, however it is best if you can get an SD-Card of 32GBytes whether you plan to use its whole graphic functionalities.

Raspberry Pi Hardware is not too powerful to run a version of Windows as well as there is no free version of MS-Windows for ARM Processor, so basicly device is planned to run free software OSes GNU / Linux. 5 operating systems are working fine with the mini-board device as time of writting;
 

  • Raspbian – Debian "Wheezy" Linux port
  • Pidora – Fedora mixed version ported to run on Raspberry Pi
  • Risk OS port
  • Arch Linux port for ARM devices
  • Slackware Arm
    FreeBSD / NetBSD
  • QtonPi

Recommended and probably best distro port is for Debian Squeeze

To boot an OS into raspberry PI dowbnload respective image from raspberrypi.org

– Use application for copying and extracting image to SD-Card like Win32 Disk Imager – whether on Windows platform

Win32DiskImager burning raspberry PI mini Linux card board computer box image

– Or from Linux format SD-Card with gparted (N!B! format disk to be in FAT32 filesystem), extrat files and copy them to SD-CARD.

Once Raspberry Pi loads up it will drop you into Linux console, so further configuration will have to be done manually with invoking plenty of apt-get commands (which I will not talk about here as there are plenty of manuals already) – you will have to manually install your Desktop … Default shipped Web browser in Debian is Midori and due to lack of ported version of flash player for ARM streaming video websites like youtube.com / vimeo.com does not work in browser. There is a Google Chrome for Raspberry Pi port but just like with Midori heavy object loaded websites works very slow and thus not very suitable for multimedia.

raspberry pi cheaest portable linux powered computer sized of a credit card

Raspberry Pi device is very suitable for ThinClient use there is a special separate project – Raspberry ThinClient Project – using which a hobbyist can save 400$ for buying proprietary ThinClient.

RaspberryPI linux as a free software hardware thinclient picture

 

Share this on