Posts Tagged ‘browser’

Block Web server overloading Bad Crawler Bots and Search Engine Spiders with .htaccess rules

Monday, September 18th, 2017

howto-block-webserver-overloading-bad-crawler-bots-spiders-with-htaccess-modrewrite-rules-file

In my last post, I talked about the problem of Search Index Crawler Robots aggressively crawling websites and how to stop them (the article is here), explaining how to raise the delays between Bot URL requests to a website and how to completely prohibit some bots from crawling with robots.txt.

As explained in that article, the consequence of too many badly written or aggressively behaving Spiders is "server stoning" and therefore degraded Web Server performance, or even a short-time Denial of Service, depending on how well the initial Server Scaling was done.
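For completeness, a minimal robots.txt in that spirit (the Crawl-delay value and the blocked bot name are purely illustrative, and Crawl-delay is a non-standard directive that only some crawlers honour) could look like:

User-agent: *
Crawl-delay: 10

User-agent: SomeAggressiveBot
Disallow: /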

Just for information, the bots we want to filter are not to be confused with the legitimate bots that drive real traffic to your website.

 The 10 Most Popular Web Crawler Bots as of time of writing are:
 

1. GoogleBot (The Google Crawler bots, funnily bots become less active on Saturday and Sundays :))

2. BingBot (Bing.com Crawler bots)

3. SlurpBot (also famous as Yahoo! Slurp)

4. DuckDuckBot (The crawler bots of the search engine duckduckgo.com)

5. Baiduspider (The most famous Chinese search engine, used as a substitute for Google in China)

6. YandexBot (Russian Yandex Search engine crawler bots used in Russia as a substitute for Google )

7. Sogou Spider (a leading Chinese Search Engine launched in 2004)

8. Exabot (A French Search Engine, launched in 2000, crawler for ExaLead Search Engine)

9. FaceBot (Facebook External hit; this crawler crawls a certain webpage only once a user shares or pastes a link to a video, music, blog or whatever in chat to another user)

10. Alexa Crawler (ia_archiver is the web crawler for Amazon's Alexa Internet Rankings; Alexa is a great site to evaluate the approximate popularity of a page on the internet, and the Alexa SiteInfo page has historically been the Swiss Army knife for anyone wanting to quickly evaluate a webpage's approximate ranking compared to other pages)

The above legitimate bots are known to follow most if not all of the W3C – World Wide Web Consortium (W3.org) standards and therefore respect the allowance or restriction commands given to them in a site's robots.txt. Unfortunately, many of the so-called Bad-Bots or mirroring scripts that are burning your Web Server CPU and Memory, mentioned in the previous article, follow the /robots.txt prescriptions either only partially or not at all.

Hence, for the bots that do not respect robots.txt, the only way to get rid of most of the web spiders that are just loading your bandwidth and server hardware is to filter / block them with Apache rules (mod_setenvif / mod_rewrite) placed in an .htaccess file.

If it does not exist yet, create an .htaccess file in the DocumentRoot of your website with whatever text editor you like, or create it on your Windows / Mac OS desktop and transfer it via FTP / SecureFTP to the server.

I prefer to do it directly on the server with vim (text editor):

 

 

vim /var/www/sites/your-domain.com/.htaccess

 

RewriteEngine On

IndexIgnore .htaccess */.??* *~ *# */HEADER* */README* */_vti*

SetEnvIfNoCase User-Agent "^Black Hole" bad_bot
SetEnvIfNoCase User-Agent "^Titan" bad_bot
SetEnvIfNoCase User-Agent "^WebStripper" bad_bot
SetEnvIfNoCase User-Agent "^NetMechanic" bad_bot
SetEnvIfNoCase User-Agent "^CherryPicker" bad_bot
SetEnvIfNoCase User-Agent "^EmailCollector" bad_bot
SetEnvIfNoCase User-Agent "^EmailSiphon" bad_bot
SetEnvIfNoCase User-Agent "^WebBandit" bad_bot
SetEnvIfNoCase User-Agent "^EmailWolf" bad_bot
SetEnvIfNoCase User-Agent "^ExtractorPro" bad_bot
SetEnvIfNoCase User-Agent "^CopyRightCheck" bad_bot
SetEnvIfNoCase User-Agent "^Crescent" bad_bot
SetEnvIfNoCase User-Agent "^Wget" bad_bot
SetEnvIfNoCase User-Agent "^SiteSnagger" bad_bot
SetEnvIfNoCase User-Agent "^ProWebWalker" bad_bot
SetEnvIfNoCase User-Agent "^CheeseBot" bad_bot
SetEnvIfNoCase User-Agent "^Teleport" bad_bot
SetEnvIfNoCase User-Agent "^TeleportPro" bad_bot
SetEnvIfNoCase User-Agent "^MIIxpc" bad_bot
SetEnvIfNoCase User-Agent "^Telesoft" bad_bot
SetEnvIfNoCase User-Agent "^Website Quester" bad_bot
SetEnvIfNoCase User-Agent "^WebZip" bad_bot
SetEnvIfNoCase User-Agent "^moget/2.1" bad_bot
SetEnvIfNoCase User-Agent "^WebZip/4.0" bad_bot
SetEnvIfNoCase User-Agent "^WebSauger" bad_bot
SetEnvIfNoCase User-Agent "^WebCopier" bad_bot
SetEnvIfNoCase User-Agent "^NetAnts" bad_bot
SetEnvIfNoCase User-Agent "^Mister PiX" bad_bot
SetEnvIfNoCase User-Agent "^WebAuto" bad_bot
SetEnvIfNoCase User-Agent "^TheNomad" bad_bot
SetEnvIfNoCase User-Agent "^WWW-Collector-E" bad_bot
SetEnvIfNoCase User-Agent "^RMA" bad_bot
SetEnvIfNoCase User-Agent "^libWeb/clsHTTP" bad_bot
SetEnvIfNoCase User-Agent "^asterias" bad_bot
SetEnvIfNoCase User-Agent "^httplib" bad_bot
SetEnvIfNoCase User-Agent "^turingos" bad_bot
SetEnvIfNoCase User-Agent "^spanner" bad_bot
SetEnvIfNoCase User-Agent "^InfoNaviRobot" bad_bot
SetEnvIfNoCase User-Agent "^Harvest/1.5" bad_bot
SetEnvIfNoCase User-Agent "Bullseye/1.0" bad_bot
SetEnvIfNoCase User-Agent "^Mozilla/4.0 (compatible; BullsEye; Windows 95)" bad_bot
SetEnvIfNoCase User-Agent "^Crescent Internet ToolPak HTTP OLE Control v.1.0" bad_bot
SetEnvIfNoCase User-Agent "^CherryPickerSE/1.0" bad_bot
SetEnvIfNoCase User-Agent "^CherryPicker /1.0" bad_bot
SetEnvIfNoCase User-Agent "^WebBandit/3.50" bad_bot
SetEnvIfNoCase User-Agent "^NICErsPRO" bad_bot
SetEnvIfNoCase User-Agent "^Microsoft URL Control - 5.01.4511" bad_bot
SetEnvIfNoCase User-Agent "^DittoSpyder" bad_bot
SetEnvIfNoCase User-Agent "^Foobot" bad_bot
SetEnvIfNoCase User-Agent "^WebmasterWorldForumBot" bad_bot
SetEnvIfNoCase User-Agent "^SpankBot" bad_bot
SetEnvIfNoCase User-Agent "^BotALot" bad_bot
SetEnvIfNoCase User-Agent "^lwp-trivial/1.34" bad_bot
SetEnvIfNoCase User-Agent "^lwp-trivial" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.6" bad_bot
SetEnvIfNoCase User-Agent "^BunnySlippers" bad_bot
SetEnvIfNoCase User-Agent "^Microsoft URL Control - 6.00.8169" bad_bot
SetEnvIfNoCase User-Agent "^URLy Warning" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.5.3" bad_bot
SetEnvIfNoCase User-Agent "^LinkWalker" bad_bot
SetEnvIfNoCase User-Agent "^cosmos" bad_bot
SetEnvIfNoCase User-Agent "^moget" bad_bot
SetEnvIfNoCase User-Agent "^hloader" bad_bot
SetEnvIfNoCase User-Agent "^humanlinks" bad_bot
SetEnvIfNoCase User-Agent "^LinkextractorPro" bad_bot
SetEnvIfNoCase User-Agent "^Offline Explorer" bad_bot
SetEnvIfNoCase User-Agent "^Mata Hari" bad_bot
SetEnvIfNoCase User-Agent "^LexiBot" bad_bot
SetEnvIfNoCase User-Agent "^Web Image Collector" bad_bot
SetEnvIfNoCase User-Agent "^The Intraformant" bad_bot
SetEnvIfNoCase User-Agent "^True_Robot/1.0" bad_bot
SetEnvIfNoCase User-Agent "^True_Robot" bad_bot
SetEnvIfNoCase User-Agent "^BlowFish/1.0" bad_bot
SetEnvIfNoCase User-Agent "^JennyBot" bad_bot
SetEnvIfNoCase User-Agent "^MIIxpc/4.2" bad_bot
SetEnvIfNoCase User-Agent "^BuiltBotTough" bad_bot
SetEnvIfNoCase User-Agent "^ProPowerBot/2.14" bad_bot
SetEnvIfNoCase User-Agent "^BackDoorBot/1.0" bad_bot
SetEnvIfNoCase User-Agent "^toCrawl/UrlDispatcher" bad_bot
SetEnvIfNoCase User-Agent "^WebEnhancer" bad_bot
SetEnvIfNoCase User-Agent "^TightTwatBot" bad_bot
SetEnvIfNoCase User-Agent "^suzuran" bad_bot
SetEnvIfNoCase User-Agent "^VCI WebViewer VCI WebViewer Win32" bad_bot
SetEnvIfNoCase User-Agent "^VCI" bad_bot
SetEnvIfNoCase User-Agent "^Szukacz/1.4" bad_bot
SetEnvIfNoCase User-Agent "^QueryN Metasearch" bad_bot
SetEnvIfNoCase User-Agent "^Openfind data gathere" bad_bot
SetEnvIfNoCase User-Agent "^Openfind" bad_bot
SetEnvIfNoCase User-Agent "^Xenu's Link Sleuth 1.1c" bad_bot
SetEnvIfNoCase User-Agent "^Xenu's" bad_bot
SetEnvIfNoCase User-Agent "^Zeus" bad_bot
SetEnvIfNoCase User-Agent "^RepoMonkey Bait & Tackle/v1.01" bad_bot
SetEnvIfNoCase User-Agent "^RepoMonkey" bad_bot
SetEnvIfNoCase User-Agent "^Zeus 32297 Webster Pro V2.9 Win32" bad_bot
SetEnvIfNoCase User-Agent "^Webster Pro" bad_bot
SetEnvIfNoCase User-Agent "^EroCrawler" bad_bot
SetEnvIfNoCase User-Agent "^LinkScan/8.1a Unix" bad_bot
SetEnvIfNoCase User-Agent "^Keyword Density/0.9" bad_bot
SetEnvIfNoCase User-Agent "^Kenjin Spider" bad_bot
SetEnvIfNoCase User-Agent "^Cegbfeieh" bad_bot

 

<Limit GET POST>
order allow,deny
allow from all
Deny from env=bad_bot
</Limit>
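Note that the Order / Allow / Deny syntax in the block above is the old Apache 2.2 style. On Apache 2.4, unless the mod_access_compat module is loaded, a rough equivalent sketch using mod_authz_core would be:

<RequireAll>
    # let everyone in ...
    Require all granted
    # ... except clients marked as bad_bot by the SetEnvIfNoCase lines above
    Require not env bad_bot
</RequireAll>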

 


The above Bad Bot prohibition rules include the RewriteEngine On directive; however, for many websites this directive is already enabled directly in the VirtualHost section of the domain(s). If that is your case, you can also remove RewriteEngine On from .htaccess and the bad bot prohibition rules will continue to work.
The rules are also perfectly suitable for WordPress based websites / blogs in case you need to filter out obstructive spiders, even though they will work on any website domain with mod_rewrite enabled.

Once you have implemented the above rules you will not need to restart Apache, as .htaccess is read dynamically on each client request to the Webserver.
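Keep in mind that .htaccess is only honoured if overrides are allowed for the DocumentRoot; if the rules seem to be silently ignored, check that the VirtualHost / Directory configuration contains something along these lines (the path is just an example):

<Directory /var/www/sites/your-domain.com>
    AllowOverride All
</Directory>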

2. Testing .htaccess Bad Bots Filtering Works as Expected


In order to test that the new Bad Bot filtering configuration is working properly, there is a manual and somewhat more complicated way with lynx (the text browser), assuming you have shell access to a Linux / BSD / *NIX computer, or you run your own *NIX server / desktop.
 

Here is how:
 

 

lynx -useragent="Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)" -head -dump http://www.your-website-filtering-bad-bots.com/

 

 

Note that lynx will provide a warning such as:

Warning: User-Agent string does not contain "Lynx" or "L_y_n_x"!

Just ignore it and press enter to continue.
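If you have curl at hand, an even quicker check is to send a request with one of the filtered User-Agents (CheeseBot below is just one of the agents from the list above) and look at the returned status code:

curl -I -A "CheeseBot" http://www.your-website-filtering-bad-bots.com/
# a working filter should answer with: HTTP/1.1 403 Forbidden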

Two other use cases with lynx that I have historically used heavily are to pretend you're GoogleBot, in order to see how Google actually sees your website:
 

  • Pretend with Lynx You're GoogleBot

 

lynx -useragent="Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" -head -dump http://www.your-domain.com/

 

 

  • How to Pretend with Lynx Browser You are GoogleBot-Mobile

 

lynx -useragent="Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)" -head -dump http://www.your-domain.com/

 


Or, for the lazy ones that don't have Linux / *NIX at their disposal, you can use the WannaBrowser website.

WannaBrowser is a web based browser emulator which gives you the ability to change the User-Agent on each website request, so just set your User-Agent to any bot we just filtered, for example CheeseBot.

Once the earlier added .htaccess rules detect that your browser client comes in with a prohibited User-Agent, the request will immediately be filtered out and you'll be unable to access the website, getting a message like:
 

HTTP/1.1 403 Forbidden

 

Since I've talked a lot about Index Bots, I think it is also worth mentioning three great websites that can give you up-to-date information on the exact user-agents Spiders return, commonly known Bot traits, as well as a currently updated list of the Bad Bots etc.

Bot and Browser Resources with information on user-agents, bad bots and odd Crawler and Bot specifics:

1. botreports.com
2. user-agents.org
3. useragentapi.com

 

An updated list of robot user-agents (crawler-user-agents) is also available on GitHub here, regularly updated by Caio Almeida.

There are also third party plugins (modules) available for Website Platforms like WordPress / Joomla / TYPO3 etc.

Besides the known Bad and Good Bots listed on these websites, there are perhaps a hundred others that might end up crawling your website and that might or might not need to be filtered; therefore, before proceeding with any filtering steps, it is generally a good idea to monitor your HTTPD access.log / error.log, as mistakenly filtering the wrong bot can be a reason for Website Indexing Problems.

Hope this article gives you some valuable information. Enjoy! 🙂

 

Enable TLS 1.2 Internet Explorer / Make TLS 1.1 and TLS 1.2 web sites work on IE howto

Monday, August 1st, 2016

Internet-Explorer-cannot-display-the-webpage-IE-error
 

Some corporate websites and web tools, especially ones in DMZ-ed internal corporate networks, require encryption with the TLS 1.2 (Transport Layer Security) cryptographic protocol or at least TLS 1.1, since the older SSL 3.0 / TLS 1.0 protocols are already insecure (prone to vulnerabilities).

Besides the TLS 1.2 browser requirements, some corporate tool web interfaces, like Firewall Opening request tools etc., are often very limited in browser compatibility and built to only work with certain versions of Microsoft Internet Explorer, like let's say IE (Internet Explorer) 11.

TLS 1.2 is supported in IE 8, 9, 10 and 11, so sooner or later you might be forced to reconfigure your Internet Explorer to enable TLS 1.2 / 1.1, which are disabled by the default OS install.

For those unaware, the TLS (Transport Layer Security) protocol is, so to say, the next generation encryption protocol after SSL (Secure Sockets Layer); both the TLS and SSL terms are used interchangeably when referring to encrypting traffic between point A and point B (host / device etc.) with a key and a specific cryptographic algorithm.
Historically TLS has mostly been used in Mail Servers, even though, as I said, some web tools are starting to use TLS as a substitute for SSL certificate browser encryption or even in conjunction with it.
For those who want to dig a little bit further into What is TLS? – read on TechNet here.

I had to enable TLS on IE, and I guess sooner or later others will need a way to enable TLS 1.2 on Internet Explorer, so here is how this is done:
 

Enable-Internet-Explorer-TLS1.2-TLS-1.1-internet-options-IE-screensho
 


    1. On the Internet Explorer Main Menu (press Alt + F to make menu field appear)
    Select Tools > Internet Options.

    2. In the Internet Options box, select the Advanced tab.

    3. In the Security category, uncheck Use SSL 3.0 (if necessary) and check the ticks:

    Use TLS 1.0,
    Use TLS 1.1 and Use TLS 1.2 (if available).

    4. Click OK
   
     5. Finally, exit the browser and start IE again.

 

Once the browser is relaunched, the website URL that earlier used to show the "Internet Explorer cannot display the webpage" / can't connect / missing website error message will start opening normally.

Note that TLS 1.2 and 1.1 are not supported in older Mozilla Firefox browser releases, though they are supported properly (and enabled by default) in recent FF releases.

If you have a fresh new Firefox browser and you want to make sure it is really supporting TLS 1.1 and TLS 1.2 encryption:

 

(1) In a new tab, type or paste about:config in the address bar and press Enter/Return. Click the button promising to be careful.

(2) In the search box above the list, type or paste TLS and pause while the list is filtered

(3) If the security.tls.version.max preference is bolded and "user set" to a value other than 3, right-click > Reset the preference to restore the default value of 3

(4) If the security.tls.version.min preference is bolded and "user set" to a value other than 1, right-click > Reset the preference to restore the default value of 1

The values for these preferences mean:

1 => TLS 1.0
2 => TLS 1.1
3 => TLS 1.2


To get more concrete and thorough information on the exact TLS / SSL cipher suites and protocol details supported by your browser, check this link.
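If you also want to check from the command line which TLS versions a given web server accepts (assuming a Linux / *NIX box with a reasonably recent OpenSSL), something like this can be used – a successful handshake prints the certificate and session details, an unsupported version ends with a handshake failure:

# does the server accept TLS 1.2 ?
openssl s_client -connect your-corporate-tool.example.com:443 -tls1_2 < /dev/null
# and TLS 1.1 ?
openssl s_client -connect your-corporate-tool.example.com:443 -tls1_1 < /dev/null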


N.B.! SSL 3.0 is by default disabled in many latest-version browsers such as Opera, Safari etc. in order to address the POODLE SSL / TLS cryptographic protocol vulnerability.

Improve Apache Load Balancing with mod_cluster – Apaches to Tomcats Application servers Get Better Load Balancing

Thursday, March 31st, 2016

improve-apache-load-balancing-with-mod_cluster-apaches-to-tomcats-application-servers-get-better-load-balancing-mod_cluster-logo


Earlier I've blogged on how to set up Apache to serve as a Load Balancer for 2, 3, 4, etc. Tomcat / other backend application servers with mod_proxy and mod_proxy_balancer. Though the default Apache-provided mod_proxy_balancer works fine most of the time, if you want more precise and sophisticated balancing with better load distribution you will probably want to install and use mod_cluster instead.

 

So what is Mod_Cluster and why use it instead of Apache proxy_balancer ?
 

Mod_cluster is an innovative Apache module for HTTP load balancing and proxying. It implements a communication channel between the load balancer and back-end nodes to make better load-balancing decisions and redistribute loads more evenly.

Why use mod_cluster instead of a traditional load balancer such as Apache's mod_balancer and mod_proxy or even a high-performance hardware balancer?

Thanks to its unique back-end communication channel, mod_cluster takes into account back-end servers' loads, and thus provides better and more precise load balancing tailored for JBoss and Tomcat servers. Mod_cluster also knows when an application is undeployed, and does not forward requests for its context (URL path) until its redeployment. And mod_cluster is easy to implement, use, and configure, requiring minimal configuration on the front-end Apache server and on the back-end servers.
 


So what is the advantage of mod_cluster vs mod_proxy_balancer?

Well, here are a few things that turn the scales in favour of mod_cluster:

 

  •     advertises its presence via multicast so that workers can join without any configuration
     
  •     workers will report their available contexts
     
  •     mod_cluster will create proxies for these contexts automatically
     
  •     if you want to, you can still fine-tune this behaviour, e.g. so that .gif images are served from httpd and not from workers…
     
  •     most importantly: unlike pure mod_proxy or mod_jk, mod_cluster knows exactly how much load there is on each node because nodes are reporting their load back to the balancer via special messages
     
  •     default communication goes over AJP, you can use HTTP and HTTPS

 

1. How to install mod_cluster on Linux ?


You can use mod_cluster either with JBoss or Tomcat back-end servers. We'll install and configure mod_cluster with Tomcat under CentOS; using it with JBoss or on other Linux distributions is a similar process. I'll assume you already have at least one front-end Apache server and a few back-end Tomcat servers installed.

To install mod_cluster, first download the latest mod_cluster httpd binaries. Make sure to select the correct package for your hardware architecture – 32- or 64-bit.
Unpack the archive to create four new Apache module files: mod_advertise.so, mod_manager.so, mod_proxy_cluster.so, and mod_slotmem.so. We won't need mod_advertise.so; it advertises the location of the load balancer through multicast packets, but we will use a static address on each back-end server.

Copy the other three .so files to the default Apache modules directory (/etc/httpd/modules/ for CentOS).
Before loading the new modules in Apache you have to remove the default proxy balancer module (mod_proxy_balancer.so) because it is not compatible with mod_cluster.

Edit the Apache configuration file (/etc/httpd/conf/httpd.conf) and remove the line

 

LoadModule proxy_balancer_module modules/mod_proxy_balancer.so

 


Create a new configuration file and give it a name such as /etc/httpd/conf.d/mod_cluster.conf. Use it to load mod_cluster's modules:

 

 

 

LoadModule slotmem_module modules/mod_slotmem.so
LoadModule manager_module modules/mod_manager.so
LoadModule proxy_cluster_module modules/mod_proxy_cluster.so

In the same file add the rest of the settings you'll need for mod_cluster – the listener port plus the permissions and VirtualHost section, something like:

Listen 192.168.180.150:9999

<VirtualHost 192.168.180.150:9999>

    <Directory />
        Order deny,allow
        Allow from 192.168
    </Directory>

    ManagerBalancerName mymodcluster
    EnableMCPMReceive
</VirtualHost>

ProxyPass / balancer://mymodcluster/


The above directives create a new virtual host listening on port 9999 on the Apache server you want to use for load balancing, on which the load balancer will receive information from the back-end application servers. In this example, the virtual host is listening on IP address 192.168.180.150, and for security reasons it allows connections only from the 192.168.0.0/16 network.
The directive ManagerBalancerName defines the name of the cluster – mymodcluster in this example. The directive EnableMCPMReceive allows the back-end servers to send updates to the load balancer. The standard ProxyPass directive (optionally together with ProxyPassReverse) instructs Apache to proxy all requests to the mymodcluster balancer.
That's all you need for a minimal configuration of mod_cluster on the Apache load balancer. At next server restart Apache will automatically load the file mod_cluster.conf from the /etc/httpd/conf.d directory. To learn about more options that might be useful in specific scenarios, check mod_cluster's documentation.

While you're changing Apache configuration, you should probably set the log level in Apache to debug when you're getting started with mod_cluster, so that you can trace the communication between the front- and the back-end servers and troubleshoot problems more easily. To do so, edit Apache's configuration file and add the line LogLevel debug, then restart Apache.
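On CentOS that could look something like the lines below (paths and service name assume the stock httpd package):

# enable verbose logging while setting mod_cluster up
echo 'LogLevel debug' >> /etc/httpd/conf/httpd.conf
# restart Apache so the new log level takes effect
service httpd restart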
 

2. How to set up Tomcat appserver for mod_cluster ?
 

Mod_cluster works with Tomcat versions 6, 7 and 8. To set up the Tomcat back ends you have to deploy a few JAR files and make a change in Tomcat's server.xml configuration file.
The necessary JAR files extend Tomcat's default functionality so that it can communicate with the proxy load balancer. You can download the JAR file archive by clicking on "Java bundles" on the mod_cluster download page. It will be saved under the name mod_cluster-parent-1.2.6.Final-bin.tar.gz.

Create a new directory such as /root/java_bundles and extract the files from mod_cluster-parent-1.2.6.Final-bin.tar.gz there. Inside the directory /root/java_bundles/JBossWeb-Tomcat/lib/ you will find all the necessary JAR files for Tomcat, including two Tomcat version-specific JAR files – mod_cluster-container-tomcat6-1.2.6.Final.jar for Tomcat 6 and mod_cluster-container-tomcat7-1.2.6.Final.jar for Tomcat 7. Delete the one that does not correspond to your Tomcat version.

Copy all the files from /root/java_bundles/JBossWeb-Tomcat/lib/ to your Tomcat lib directory – thus if you have installed Tomcat in

/srv/tomcat

run the command:

 

cp -rpf /root/java_bundles/JBossWeb-Tomcat/lib/* /srv/tomcat/lib/.

 

Then edit your Tomcat's server.xml file

/srv/tomcat/conf/server.xml.


After the default listeners add the following line:

 

<Listener className="org.jboss.modcluster.container.catalina.standalone.ModClusterListener" proxyList="192.168.180.150:9999"/>



This instructs Tomcat to send its mod_cluster-related information to IP 192.168.180.150 on TCP port 9999, which is what we set up as Apache's dedicated vhost for mod_cluster.
While that's enough for a basic mod_cluster setup, you should also configure a unique, intuitive JVM route value on each Tomcat instance so that you can easily differentiate the nodes later. To do so, edit the server.xml file and extend the Engine property to contain a jvmRoute, like this:
 


 

<Engine name="Catalina" defaultHost="localhost" jvmRoute="node2"></Engine>


Assign a different value, such as node2, to each Tomcat instance. Then restart Tomcat so that these settings take effect.
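Assuming the /srv/tomcat install path used above, restarting could be as simple as:

# stop and start the Tomcat instance so the new jvmRoute / listener are picked up
/srv/tomcat/bin/shutdown.sh
/srv/tomcat/bin/startup.sh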

To confirm that everything is working as expected and that the Tomcat instance connects to the load balancer, grep Tomcat's log for the string "modcluster" (case-insensitive). You should see output similar to:

Mar 29, 2016 10:05:00 AM org.jboss.modcluster.ModClusterService init
INFO: MODCLUSTER000001: Initializing mod_cluster ${project.version}
Mar 29, 2016 10:05:17 AM org.jboss.modcluster.ModClusterService connectionEstablished
INFO: MODCLUSTER000012: Catalina connector will use /192.168.180.150


This shows that mod_cluster has been successfully initialized and that it will use the connector for 192.168.180.150, the configured IP address for the main listener.
Also check Apache's error log. You should see confirmation about the properly working back-end server:

[Tue Mar 29 10:05:00 2016] [debug] proxy_util.c(2026): proxy: ajp: has acquired connection for (192.168.180.150)
[Tue Mar 29 10:05:00 2016] [debug] proxy_util.c(2082): proxy: connecting ajp://192.168.180.150:8009/ to 192.168.180.150:8009
[Tue Mar 29 10:05:00 2016] [debug] proxy_util.c(2209): proxy: connected / to 192.168.180.150:8009
[Tue Mar 29 10:05:00 2016] [debug] mod_proxy_cluster.c(1366): proxy_cluster_try_pingpong: connected to backend
[Tue Mar 29 10:05:00 2016] [debug] mod_proxy_cluster.c(1089): ajp_cping_cpong: Done
[Tue Mar 29 10:05:00 2016] [debug] proxy_util.c(2044): proxy: ajp: has released connection for (192.168.180.150)


This Apache error log shows that an AJP connection with 192.168.180.150 was successfully established and confirms the working state of the node, then shows that the load balancer closed the connection after the successful attempt.

You can start testing by opening in a browser the example servlet SessionExample, which is available in a default installation of Tomcat.
Access this servlet through a browser at the URL http://balancer_address/examples/servlets/servlet/SessionExample. In your browser you should see first a session ID that contains the name of the back-end node that is serving your request – for instance, Session ID: 5D90CB3C0AA05CB5FE13121E4B23E670.node2.

Next, through the servlet's web form, create different session attributes. If you have a properly working load balancer with sticky sessions you should always (that is, until your current browser session expires) access the same node, with the previously created session attributes still available.
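The same sticky-session check can also be scripted with curl (balancer_address is a placeholder for your balancer's address and the cookie file path is arbitrary):

# first request – store the JSESSIONID cookie handed out via the balancer
curl -s -c /tmp/mc-cookies.txt "http://balancer_address/examples/servlets/servlet/SessionExample" | grep "Session ID"
# repeat with the stored cookie – with sticky sessions the same .nodeX suffix should come back every time
curl -s -b /tmp/mc-cookies.txt "http://balancer_address/examples/servlets/servlet/SessionExample" | grep "Session ID"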

To test further to confirm load balancing is in place, at the same time open the same servlet from another browser. You should be redirected to another back-end server where you can conduct a similar session test.
As you can see, mod_cluster is easy to use and configure. Give it a try to address sporadic single-back-end overloads that cause overall application slowdowns.

Run native Internet Explorer 6 on latest Debian / Ubuntu Linux with IEs4Linux

Thursday, July 24th, 2014

Install-internet-explorer-on-debian-linux-IEs4Linux_logo.svg
If you're a GNU / Linux Desktop user like me and you have to administer hybrid server environments – a mixture of MS Windows with the Microsoft IIS webserver running an application developed in Active Server Pages (.ASP), or UNIX / GNU Linux servers running web applications that use Mono on the server side – you often need a browser which properly supports the Internet Explorer Trident web (layout) renderer (also famous as MSHTML).

Having Internet Explorer on your Linux is very useful for web developers who want to test how their website works under IE.

Of course you can always install Windows in a VirtualBox VM and do your testing in the Virtual Machine, but this takes time to install and also puts a useless load on the PC…

IEs4Linux is a free (open source) shell script that lets you run Internet Explorer on your Linux desktop.

The ies4linux script collection uses WINE (Wine Is Not an Emulator) to run the native Windows Internet Explorer, thus before using it you have to install Wine.

There are plenty of tutorials online about ies4linux; the problem is that, as it is no longer updated and developed, most tutorials don't work on Debian Wheezy / Ubuntu and the rest of the deb based Linux distros.
This is why I decided to write just another ies4linux tutorial that actually works!

On Debian / Ubuntu / Mint Linux install via apt-get:

apt-get --yes install wine

Then, with a non-root user, download ies4linux-latest.tar.gz. Just in case ies4linux-latest.tar.gz disappears in the future, I've also created an ies4linux-latest.tar.gz mirror for download here

and unarchive the tar archive:

wget http://www.tatanka.com.br/ies4linux/downloads/ies4linux-latest.tar.gz
tar -zxvf ies4linux-latest.tar.gz
cd ies4linux-*
./ies4linux

You will get:
 

IEs4Linux 2 is developed to be used with recent Wine versions (0.9.x). It seems that you are using an old version. It's recommended that you update your wine to the latest version (Go to: winehq.com).

You need to install cabextract first!
Download it here: http://www.kyz.uklinux.net/cabextract.php

To fulfill this requirement you will also need the cabextract package, which is luckily part of Debian:
 

apt-get install --yes cabextract

On Wine version 1.0 and onwards wineprefixcreate has been replaced by the winecfg binary.
To prevent missing wineprefixcreate errors during the ies4linux installer run, it's necessary to symlink as a workaround:
 

ln -sv /usr/bin/winecfg /usr/bin/wineprefixcreate


To continue with the Internet Explorer ies4linux installer, run again:

./ies4linux

images/internet-explorer-for-linux-debian-gnu-linux-screenshot

/images/internet-explorer-4-linux-installer-debian-gnu-linu-screenshot

You will get the installer GUI window with a selection option for which Internet Explorer version you want. Choose between IE 5.0, IE 5.5 and IE 6. It is also possible to install IE 7, which is still considered a beta version and is less tested and unstable – it will probably lead to crashes. If you want to install IE 7 as well, check it as an option from the Advanced menu.

/images/ies4linux-internet-explorer-installer-debian-gnu-linux-screenshot

If you get permission errors after running the ies4linux GUI installer, to solve that chown the directory recursively to the user you will be running it with:
 

chown -R hipo:hipo ies4linux-2.99.0.1

 

The Internet Explorer for Linux downloader will connect to the Microsoft.com website and download DCOM, MFC and the various required IE .CAB files.

If you get unexpected crashes from the ies4linux GUI installer, you can try to download all required IE binaries, surrounding files and the flash player using the no-gui installer with the command:
 

./ies4linux --no-gui --install-corefonts

IEs4Linux 2 is developed to be used with recent Wine versions (0.9.x). It seems that you are using an old version. It's recommended that you update your wine to the latest version (Go to: winehq.com).

IEs4Linux will:
  – Install Internet Explorers: 6.0
  – Using IE locale: EN-US
  – Install Adobe Flash 9.0
  – Install MS Core Fonts
  – Install everything at: /home/hipo/.ies4linux
[ OK ]

Downloading everything we need
  Downloading from microsoft.com:
   DCOM98.EXE
   mfc42.cab
   249973USA8.exe
   ADVAUTH.CAB
   CRLUPD.CAB
   HHUPD.CAB
   IEDOM.CAB
   IE_EXTRA.CAB
   IE_S1.CAB
   IE_S2.CAB
   IE_S5.CAB
   IE_S4.CAB
   IE_S3.CAB
   IE_S6.CAB
   SETUPW95.CAB
   FONTCORE.CAB
   FONTSUP.CAB
   VGX.CAB
   SCR56EN.CAB

  Downloading from macromedia.com:
   100% swflash.cab

  Downloading from sourceforge.net
   0%   webdin32.exe[ OK ]bdin32.exe

Installing IE 6
  Initializing
  Creating Wine Prefix
Your wine does not have wineprefixcreate installed. Maybe you are running an old Wine version. Try to update it to the latest version.

To fix the error:

Your wine does not have wineprefixcreate installed. Maybe you are running an old Wine version. Try to update it to the latest version.

vim lib/functions.sh

Go to line 36 (Type :36 in vim)

Line:

wine --version 2>&1 | grep -q "0.9." || warning $MSG_WARNING_OLDWINE

Has to be changed to:

wine --version 2>&1 | egrep -q "0.9.|-1." || warning $MSG_WARNING_OLDWINE


Also you need to substitute wineprefixcreate with wineboot (if you haven't already symlinked wineprefixcreate to winecfg – as pointed out earlier in the article).

To do so, make the following substitution in lib/install.sh and in lib/functions.sh:

cp -rpf lib/install.sh lib/install.sh.bak; cat lib/install.sh |sed -e 's#wineprefixcreate#wineboot#g' > lib/install_new.sh; mv lib/install_new.sh lib/install.sh

cp -rpf lib/functions.sh lib/functions.sh.bak; cat lib/functions.sh |sed -e 's#wineprefixcreate#wineboot#g' > lib/functions_new.sh; mv lib/functions_new.sh lib/functions.sh


Also it is necessary to change the default corefonts download URL, which points to SourceForge but is failing. I've made a mirror of the corefonts files here:
 

cp -rpf lib/install.sh lib/install.sh.bak; cat lib/install.sh |sed -e 's#http://internap.dl.sourceforge.net/sourceforge/corefonts/#http://www.pc-freak.net/files/corefonts/#g' > lib/install_new.sh; mv lib/install_new.sh lib/install.sh

Re-run the ies4linux console installer:
 

 ./ies4linux --no-gui --install-corefonts

….
IEs4Linux installations finished!

On installation success you should get output like the above.
Hopefully you will see no errors like in my case; if you get the corefonts download error again, re-run the installer and it should successfully download the files.

To then run ies4linux:

~/bin/ie6


internet-explorer-ies4linx-running-on-debian-gnu-linux-screenshot
Though IEs4Linux is good for basic testing, it is not possible to use the browser for normal browsing because it's a bit buggy and slow.

By default Internet Explorer 6's behavior is to prompt a security alert on various actions; though this might be useful for debugging, it is really annoying, so I personally disabled those by decreasing the Security Level via:

Tools -> Internet Options -> Security -> (Security Level)
I've decreased it from Medium to Medium-Low.

ies4linux has not been developed since 2008 and as of time of writing the official ies4linux project website seems abandoned.

 

Linux convert and read .mht (Microsoft html) file format. MHT format explained

Thursday, June 5th, 2014

linux-open-and-convert-mht-file-format-to-html-howto
If you're using Linux as a Desktop system, sooner or later you will receive an email with instructions or an HTML page stored in the .mht file format.
So what is MHT? MHT is a webpage archive format (short for MIME HTML document). MHTML saves the Web page content and incorporates external resources, such as images, applets, Flash animations and so on, into HTML documents. Usually those .mht files were produced with Microsoft Internet Explorer – saving pages through:

File -> Save As (Save WebPage) dialog saves pages in .MHT.

To open those .mht files on Linux where Firefox is available, add the UNMHT FF Extension to the browser. Besides allowing you to view MHT on Linux, in case some customer requires a copy of an HTML page in MHT, UNMHT also lets you save complete web pages, including text and graphics, into an MHT file.
There is also support in the Google Chrome browser for MHT opening and saving via a plugin called IETab, but unfortunately IETab is not supported on Linux.
Anyway, IETab is worth mentioning here because if you're a Windows user and you want to browse pages compatible only with Internet Explorer, IETab emulates IE exactly by using the IE rendering engine inside Chrome and supports ActiveX Controls. IETab is a great extension for QA (web testers) using Windows on the desktop who prefer not to use IE for security reasons. IETab supports IE6, IE7, IE8 and IE9.

Another way to convert .MHT content file into HTML is to use Linux KDE's mhttohtml tool.

linux-kde-converter-mhttohtml

Another approach to open .MHT files on Linux is to use the Opera browser for Linux, which has support for .MHT.
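Since an MHT file is just plain MIME, another command line option worth trying is munpack from the mpack package, which should be able to extract the HTML part and the embedded resources into separate files:

# Debian / Ubuntu
apt-get install mpack
# extract the MIME parts of the MHT archive into the current directory
munpack saved-page.mht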

Note that because MHT files could be storing potentially malicious content (like embedded Malware), it is always wise when opening MHT on Windows to make sure you have scanned the file with an Antivirus program. Mails containing .MHT from unknown senders often contain viruses or malware. Also, links embedded into an MHT file could easily expose you to spoofing attacks. MHT files are encoded in a combination of plain text MIME and the BASE64 encoding scheme; MHT's mimetype is:

MIME type: message/rfc822
 

Use apt-get with Proxy howto – Set Proxy system-wide in Linux shell and Gnome

Friday, May 16th, 2014

linux-apt-get-configure-proxy-howto-set-proxy-systemwide-in-linux

I just set up a VMWare Virtual Machine on my HP notebook and installed Debian 7.0 stable Wheezy. Though VMWare identified my Office Internet and automatically configured NAT, I couldn't access the internet from a browser until I remembered that all HP traffic goes through a default browser proxy.
After setting the proxy in Iceweasel, Internet pages started opening normally; however, as every kind of traffic was only reachable via HP's proxy, package management with apt-get (apt-get update, apt-get install etc.) was failing with errors:


# apt-get update

Ign cdrom://[Debian GNU/Linux 7.2.0 _Wheezy_ – Official i386 CD Binary-1 20131012-12:56] wheezy Release.gpg
Ign cdrom://[Debian GNU/Linux 7.2.0 _Wheezy_ – Official i386 CD Binary-1 20131012-12:56] wheezy Release
Ign cdrom://[Debian GNU/Linux 7.2.0 _Wheezy_ – Official i386 CD Binary-1 20131012-12:56] wheezy/main i386 Packages/DiffIndex
Ign cdrom://[Debian GNU/Linux 7.2.0 _Wheezy_ – Official i386 CD Binary-1 20131012-12:56] wheezy/main Translation-en_US
Err http://ftp.by.debian.org wheezy Release.gpg
  Could not connect to ftp.by.debian.org:80 (86.57.151.3). – connect (111: Connection refused)
Err http://ftp.by.debian.org wheezy-updates Release.gpg
  Unable to connect to ftp.by.debian.org:http:
Err http://security.debian.org wheezy/updates Release.gpg
  Cannot initiate the connection to security.debian.org:80 (2607:ea00:101:3c0b:207:e9ff:fe00:e595). – connect (101: Network is unreachable) [IP: 2607:ea00:101:3c0b:207:e9ff:fe00:e595 80]
Reading package lists…

 

This error is caused because apt-get is trying to directly access the above HTTP URLs, and because port 80 is filtered out from the HP Office network, it fails. To make it work I had to configure apt-get to use the proxy host – here is how:

a) Create /etc/apt/apt.conf.d/02proxy file (if not already existing)
and place inside:
 

Acquire::http::Proxy "https://web-proxy.cce.hp.com:8088";
Acquire::ftp::Proxy "ftp://web-proxy.cce.hp.com:8088";


To do it from console / gnome-terminal issue:
echo 'Acquire::http::Proxy "https://web-proxy.cce.hp.com:8088";' >> /etc/apt/apt.conf.d/02proxy
echo 'Acquire::ftp::Proxy "ftp://web-proxy.cce.hp.com:8088";' >> /etc/apt/apt.conf.d/02proxy

That's all – now apt-get will tunnel all traffic via the HTTP and FTP proxy host web-proxy.cce.hp.com and apt-get works again.
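As a side note, if you only need the proxy for a single run and don't want to touch any configuration file, apt-get can also take the setting on the command line via -o (same proxy host as above):

apt-get -o Acquire::http::Proxy="https://web-proxy.cce.hp.com:8088" update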

Talking about proxifying Linux's apt-get, it is also possible to set proxy shell variables, which are read and understood by many console programs like the console browsers lynx, links and elinks, as well as the wget and curl commands, e.g.:

 

export http_proxy=http://192.168.1.5:5187/
export https_proxy=$http_proxy
export ftp_proxy=$http_proxy
export rsync_proxy=$http_proxy
export no_proxy="localhost,127.0.0.1,localaddress,.localdomain.com"

For proxies protected with a username and password the exported variables should look like so:

echo -n "username:"
read -e username
echo -n "password:"
read -es password
export http_proxy="http://$username:$password@proxyserver:8080/"
export https_proxy=$http_proxy
export ftp_proxy=$http_proxy
export rsync_proxy=$http_proxy
export no_proxy="localhost,127.0.0.1,localaddress,.localdomain.com"

To make these Linux proxy settings system wide on Debian / Ubuntu, there is the /etc/environment file; add to it:
 

http_proxy=http://proxy.server.com:8080/
https_proxy=http://proxy.server.com:8080/
ftp_proxy=http://proxy.server.com:8080/
no_proxy="localhost,127.0.0.1,localaddress,.localdomain.com"
HTTP_PROXY=http://proxy.server.com:8080/
HTTPS_PROXY=http://proxy.server.com:8080/
FTP_PROXY=http://proxy.server.com:8080/
NO_PROXY="localhost,127.0.0.1,localaddress,.localdomain.com"


To make the proxy global (system-wide) for the shell environments of most (non-Debian specific) Linux distributions, create a new file /etc/profile.d/proxy.sh and place in it something like:

function proxy(){
echo -n "username:"
read -e username
echo -n "password:"
read -es password
export http_proxy="http://$username:$password@proxyserver:8080/"
export https_proxy=$http_proxy
export ftp_proxy=$http_proxy
export rsync_proxy=$http_proxy
export no_proxy="localhost,127.0.0.1,localaddress,.localdomain.com"
echo -e "\nProxy environment variable set."
}
function proxyoff(){
unset HTTP_PROXY
unset http_proxy
unset HTTPS_PROXY
unset https_proxy
unset FTP_PROXY
unset ftp_proxy
unset RSYNC_PROXY
unset rsync_proxy
echo -e "\nProxy environment variable removed."
}

To set a Global Proxy (make the Proxy system-wide) for a user in the GNOME Desktop environment, launch gnome-control-center

And go to Network -> Network Proxy

/images/gnome-configure-systemwide-proxy-howto-picture1

/images/gnome-configure-systemwide-proxy-howto-picture2

To make the proxy settings also system wide for some GUI GNOME GTK3 applications:

gsettings set org.gnome.system.proxy mode 'manual'
gsettings set org.gnome.system.proxy.http host 'your-proxy.server.com'
gsettings set org.gnome.system.proxy.http port 8080

How to install Adobe FlashPlayer Firefox browser plugin on FreeBSD 7.2 and higher

Monday, October 1st, 2012

Install the linux_base FreeBSD port, either using the pre-compiled binary package or compiling it via the ports tree.

1. Install and set up linux_base to load on FreeBSD boot


freebsd# pkg_add -vr linux_base
Opening BINARY mode data connection for linux_base.tbz (31858826 bytes).
Fetching ftp://ftp-archive.freebsd.org/pub/FreeBSD-Archive/ports/i386/packages-7.2-release/Latest/linux_base.tbz...^CSignal 2 received, cleaning up..

Or via port tree with cmd:


cd /usr/ports/emulators/linux_base-f10 && make install clean

Next add linprocfs to /etc/fstab:


freebsd# echo 'linproc /compat/linux/proc linprocfs rw 0 0' >> /etc/fstab

Mount linproc virtual filesystem:


freebsd# mount -a

2. Set linux_base to auto load on startup via /etc/rc.conf


echo 'linux_enable="YES"' >> /etc/rc.conf

3. Install other libraries on which nspluginwrapper and the flash player depend

For me it was necessary to install linux-pango and linux-tiff, which were missing. For other people it is likely that other packages on which the flash plugin and nspluginwrapper depend are missing. If that's your case just install the required ones by pkg_add-ing them 🙂


pkg_add -vr linux-pango
....
pkg_add -vr linux-tiff
....

4. Start ABI emulation and set the Linux sysctl variables

Make sure the Linux Binaries ABI support is enabled and the sysctl variables for the emulated Linux kernel (via the FreeBSD external module) are set:


freebsd# /etc/rc.d/abi start
Additional ABI support: linux.
freebsd# /etc/rc.d/sysctl start
kern.maxfiles: 50000 -> 65535
kern.maxfilesperproc: 50000 -> 12000
kern.maxfilesperproc: 12000 -> 50000
kern.maxfiles: 65535 -> 50000

5. Set some shell and sysctl variables before installing nspluginwrapper and the flash player

Export the OVERRIDE_LINUX_BASE_PORT and OVERRIDE_LINUX_NONBASE_PORTS shell variables before installing the respective flash player. I install flash player 10, which is relatively stable on FBSD; for newer flash plugins, change the var to whatever FP version.


freebsd# setenv OVERRIDE_LINUX_BASE_PORT f10
freebsd# setenv OVERRIDE_LINUX_NONBASE_PORTS f10

It is also needed to set the compat.linux.osrelease=2.6.19 sysctl variable.


freebsd# sysctl compat.linux.osrelease=2.6.19
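To keep that setting across reboots, the same value can also be added to /etc/sysctl.conf:

freebsd# echo 'compat.linux.osrelease=2.6.19' >> /etc/sysctl.conf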

6. Install from ports nspluginwrapper and flashplugin 10

Now installing the Flashplayer is done via flash plugin port and nspluginwrapper:


freebsd# cd /usr/ports/www/linux-f10-flashplugin10 && make install clean
....
freebsd# cd /usr/ports/www/nspluginwrapper && make install clean
....

BTW, nspluginwrapper is required because the flash player is not natively compiled to run on FreeBSD but is a Linux binary.

It is also a good idea to add OVERRIDE_LINUX_BASE_PORT=f10 and OVERRIDE_LINUX_NONBASE_PORTS=f10 to /etc/make.conf to make the settings permanent:


freebsd# echo 'OVERRIDE_LINUX_BASE_PORT=f10' >> /etc/make.conf
freebsd# echo 'OVERRIDE_LINUX_NONBASE_PORTS=f10' >> /etc/make.conf

7. Adding Firefox, Opera flash plugin support for users

If after installing the flash plugin and restarting the browser a certain user still doesn't see Flash working, the plugin usually also has to be registered for that user.
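The usual way to do that is to run nspluginwrapper as the user in question, so that it wraps the Linux Flash plugin for the native browser:

# run as the user that needs Flash in Firefox / Opera
# -v = verbose, -a = auto-detect available plugins, -i = install wrappers for them
nspluginwrapper -v -a -i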

Install ShellInABox (web shell browser AJAX frontend) on Debian GNU / Linux

Sunday, September 23rd, 2012

ShellinAbox web ssh shell browser frontend for Debian, Ubuntu, Arch Linux, Redhat and other GNU / Linux distributions

ShellInABox is a tiny piece of software which enables you to access your server or desktop ssh shell over the web through an AJAX command line interface. Installing it is not a hard task. To install on any Linux just navigate to shellinabox.com, download the source tar.gz, then compile and install it.
Installing ShellInABox on Debian or Ubuntu and derivative-based Linux is even easier, as on the website there are pre-compiled deb binaries which can be installed straight away with dpkg.

For the 32 bit Debian version, installation is as simple as:

1. Download the i386 deb binary from Shellinabox.com
Just go to the website, look up the correct link and download it with links.

As of time of writing this post, to download with the links text browser:


links "http://code.google.com/p/shellinabox/downloads/detail?name=shellinabox_2.9-1_i386.deb&can=2&q="

2. Install deb pack with dpkg


dpkg -i shellinabox_2.9-1_i386.deb

For 64 bit (amd64) arch Debian, install the pre-built Debian x86-64 package (said to require Ubuntu Karmic). Though the binary is meant for Ubuntu, it also installs and starts the shellinabox service (daemon) without any problem. By default shellinabox is configured to work on port number 4200. Right after install, to test it open your favourite browser and make a request to localhost port 4200:


http://127.0.0.1:4200/
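If you want it listening on a different port, on Debian the package normally ships an /etc/default/shellinabox file (assuming that layout on your install); change the port there and restart the daemon:

# /etc/default/shellinabox
SHELLINABOX_PORT=4200

# after editing, restart the service
/etc/init.d/shellinabox restart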

BTW, I've used a couple of other Java based web ssh frontends and I should say ShellInABox is much more responsive.
Well, that's all – now enjoy connecting to your remote system's ssh using any AJAX supporting browser 🙂

How to delete your linkedin account – I don’t want to be LINKED IN!

Tuesday, April 24th, 2012

I don't want to be linkedin, Linkedin is a fake and non-sense time wasting anti-social network

I've decided to delete my LinkedIn account as I don't see any good in constant connectedness and being part of many "social" networks which, if one thinks about it deeply, are not social but anti-social.

You just stay at home staring at a screen and it will be like this until the end of your days, and even worse for the generations to come. The computer revolution or digital revolution is in reality a huge devolution (devil-lution).

To delete the linkedin account I used a short tutorial provided by This post

How to delete your linkedin account picture

To reach your Profile settings, use the upper right corner of your browser and follow the menus:

Settings -> Account -> Close your account

Once you try to delete your account, LinkedIn will try to manipulate you into staying by pushing some of your contacts at you, pointing out how you will get disconnected from them.
I'm amazed how impudent these guys can be; actually, it's not just them. If you have tried to delete your Facebook account before, you will have faced exactly the same thing. A profile (person's picture) which you recently browsed will be shown to you and you'll be told you will be unable to connect with them any more. Well, who cares – if it is God's will we will connect again 🙂

The problem with us modern people is that we're so deluded that we have started relying more on technology and human knowledge than on God. For most people who are atheists, relying more on technology than on God for their lives seems reasonable. However, for us Christians, putting more trust in technology than in God's providence for us is sinful and deadly.
I'm starting to come to the conclusion that non-technological societies are happier than technological ones. In that sense, we the Bulgarians are blessed, because technology is not so widely spread.

PHP system(); hide command output – How to hide displayed output with exec();

Saturday, April 7th, 2012

I recently wanted to use PHP's embedded system(""); external command execution function in order to use ls + wc to calculate the number of files stored in a directory. I know many would argue this is not good practice and from a performance viewpoint it is an absolutely bad idea. However, as I was too lazy to code it in PHP, I used the below line of code to do the task:

<?
echo "Hello, ";
$line_count = system("ls -1 /dir/|wc -l");
echo "File count in /dir is $line_count \n";
?>

This example worked fine for me to calculate the number of files in my /dir, but unfortunately the execution output was also displayed in the browser. It seems this is the default behaviour in both libphp and the PHP CLI. I didn't like that behaviour, so I checked online for a solution to prevent system(); from printing its output.

What I found recommended on many pages is that, to prevent the command execution output, one should use exec(); instead of system();.
Therefore I used instead of my above code:

<?
echo "Hello, ";
$line_count = exec("ls -1 /dir/|wc -l");
echo "File count in /dir is $line_count \n";
?>

By the way, instead of using exec();, it is also possible to just use ` (backticks) – in the same way as in bash scripting.

Hence the above code can also be written in shorter form like this:

<?
echo "Hello, ";
$line_count = `ls -1 /dir/|wc -l`;
echo "File count in /dir is $line_count \n";
?>

🙂