Posts Tagged ‘website’

Must-have software on a freshly installed Windows – Essential software after a fresh Windows install

Friday, March 18th, 2016

Install-update-multiple-programs-applications-at-once-using-ninite

If you're in the IT industry, even if you don't like installing Windows frequently or you're a complete Linux / BSD user, you will certainly have a lot of friends who will want help from you to re-install or fix their Windows 7 / 8 / 10 OS. At least this is the case with me every year; I'm kind of obliged to install fresh Windows on friends' or relatives' newly bought notebooks / desktop PCs.

Of course, the exact set of necessary software varies depending on whom the new Windows OS is installed for, but more or less there is a standard list of Windows software used daily by the average computer user.

Good candidates to install on a fresh Windows installation are:

  • Winrar
  • PeaZIP
  • WinZip
  • GreenShot (to be able to easily screenshot stuff and save pictures locally and to the cloud)
  • AnyDesk (non free but very functional alternative to TeamViewer) to be able to remotely access remote PC
  • TightVNC
  • iTunes / Spotify (for people who also have an iPhone smartphone)
  • DropBox or pCloud (to have some extra cloud free space)
  • FBReader (for those reading a lot of books in different formats)
  • Rufus – an efficient and lightweight tool to create bootable USB drives (BIOS or UEFI), including Windows To Go drives, with support for various disk and partition formats
  • Recuva – a data recovery tool for Windows 10 (free, with a paid Professional edition)
  • EaseUS (for specific backup / restore purposes, but unfortunately non-free)
  • For designers
  • Adobe Photoshop
  • Adobe Illustrator
  • f.lux – to control the screen brightness / colour temperature and potentially save your eyes
  • ImDisk virtual Disk Driver
  • KeePass / PasswordSafe – to Securely store your passwords
  • Putty / MobaXterm / SecureCRT / mPutty (for system administrators and programmers who have to deal with Linux / UNIX)

These are the programs I tend to install on new Windows setups, and thus I have more or less systematized the process.

I try to stick to free software where possible for each of the above categories, as a Free Software enthusiast, and luckily nowadays there is a lot of non-proprietary or at least free-as-in-beer software available out there.

For Windows sysadmins, for college and other public institution networks with multiple Windows computers that are not inside a domain, and for people in computer repair shops where dozens of Windows pre-installs or automatic software updates are necessary daily, make sure to take a look at Ninite.

ninite-automate-windows-program-deploy-and-update-on-new-windows-os-openoffice-screenshot

As the official website introduces Ninite:

Ninite – Install and Update All Your Programs at Once

Of course, as Ninite is used by organizations such as NASA, Harvard Medical School etc., it is likely the tool reports your list of installed Windows software and various other Win PC statistics to the Ninite developers and most likely to the NSA, but this probably doesn't matter much, as that is probably already the case the moment you choose to have a Windows OS installed on your PC.

ninite-choises-to-build-an-install-package-with-useful-essential-windows-software-screenshot
 

For Windows System Administrators managing small and middle-sized networks of PCs that are not inside a Domain Controller, Ninite could definitely save hours and in some cases even days of boring install and maintenance work. HP Enterprise or HP Inc. employees or ex-employees would definitely love Ninite, because what Ninite does is pretty much like the well-known HP internal tool PC COE.

Ninite can also prepare an installer containing multiple applications based on the choices made on Ninite's website, which is also a great thing, especially if you need to deploy different types of user PCs (scientific / gaming / office etc.).

Perhaps there are other useful things to install on a fresh Windows installation too; if you're using something I'm missing, let me know in the comments.

Fix “There Has Been a Critical Error on Your Website” wordpress error

Friday, December 2nd, 2022

there-has-been-a-critical-error-on-your-website-wordpress-critical-error-fix

Say you have a shiny WordPress-based website that has worked for years without any monitoring set up, but suddenly you open the site and get the terrifying error:
 

There Has Been a Critical Error on Your Website

That is quite a stress for sure, as in the first few minutes you don't understand how this happened, since you have not touched the perfectly working site for a very, very long time.
Then you start to dig frantically through the apache / nginx access.log and error.log and the MySQL error log, trying to figure it out. The first idea that pops into mind is whether you have a recent backup of the website's database. If you have a pair of high-availability webservers or a standby database instance that can serve the traffic, you might try to switch over and see whether the standby Webserver / SQL server instance serves the website fine.
However, if this is not an option and you have no standby backup service as a recovery Plan B already set up, your only option is to continue debugging what is wrong.
The next thing to check is whether there is a Web Cache or Proxy in front of your webservers that is showing you an old cached version of the website, or an ISP proxy giving you unreal results. That is easily seen from the Webserver logs. If this is neither the case, the next thing is to:
 

Enable WordPress (wp-config.php) Debug mode

By default for Security reasons the WordPress PHP execution debug mode is switched off inside wp-config.php.
When the WordPress-based blog or site starts showing odd pages, however, this can easily be changed by modifying the WP_DEBUG true|false value.

To do so, edit wp-config.php with a text editor such as vim / nano / mcedit, or, if you have no SSH access to the remote machine, copy the file to your desktop over SFTP / FTP, inspect it and make sure WP_DEBUG / WP_DEBUG_DISPLAY / WP_DEBUG_LOG have the following values:

define( 'WP_DEBUG', true );

define( 'WP_DEBUG_DISPLAY', false );

define( 'WP_DEBUG_LOG', true );
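
With WP_DEBUG_LOG enabled, PHP errors and warnings are also written to wp-content/debug.log, so if you have SSH access you can watch the log live while reloading the page. A minimal sketch, assuming the website root used in the warnings further down:

tail -f /var/www/websitecom/wp-content/debug.log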

After reloading the browser window tab that shows "There has been a critical error on your website", you should get some errors or warnings like:
 

Warning: Illegal string offset 'parent_slug' in /var/www/websitecom/wp-content/plugins/photo-gallery/booster/main.php on line 180

Warning: Illegal string offset 'slug' in /var/www/websitecom/wp-content/plugins/photo-gallery/booster/main.php on line 180

 

Then you can temporarily disable the problematic plugin, in this case photo-gallery, recheck the website, and then restore the respective plugin files from a backup snapshot taken at a moment when the website was working.
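
If WP-CLI happens to be installed on the server, the same plugin can be switched off from the shell without touching the dashboard. A sketch, assuming the site root from the warnings above and the photo-gallery plugin slug:

cd /var/www/websitecom
# list active plugins, then deactivate the misbehaving one
wp plugin list --status=active
wp plugin deactivate photo-gallery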

If this doesn't solve it, more plugins are crashing, you can't find an easy way to work around it or you are missing a backup, you might try to

 

Disable all WordPress active plugins

Disable your plugins from the dashboard, visit Plugins > Installed Plugins and tick the checkbox at the top of the list to select them all.
Then click Bulk Actions -> Deactivate, which should be enough to disable any conflicts and restore your site.

You can do essentially the same thing through SSH / FTP session.

Step 1: Log in to your site with SSH / FTP.
Step 2: Open the wp-content folder to find your plugins.
Step 3: Rename the plugins folder to plugins_old and verify that your site is working again via SSH run commands:

# cd path_to/wp-content; mv plugins plugins_old

or rename via FTP client
Step 4: Rename the folder back to “plugins”. The plugins should be disabled still, so you should be able to log in to your dashboard and activate them one by one. If
the plugins reactivate automatically, rename individual plugin folders with _old until your site is restored.
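
Again, if WP-CLI is available, the whole bulk deactivation can be done with a single command instead of renaming folders (a sketch; the plugin slug in the second command is just a hypothetical example):

# deactivate every plugin at once
wp plugin deactivate --all
# later, re-enable them one by one, e.g.:
wp plugin activate akismet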

Raise the PHP Memory Limit

Sometimes, a low PHP memory limit causes critical errors on WP-based blogs and sites; if necessary, raise the memory limit by adding to wp-config.php:

define( 'WP_MEMORY_LIMIT', '128M' );

Change Max Upload File Size and Text Processing function limits

To increase the max upload file size, add this code to wp-config.php:

ini_set('upload_max_filesize', '256M');

ini_set('post_max_size','256M');

And to fix the breaking of large pages on your site, add this code:

ini_set('pcre.recursion_limit',20000000);
ini_set('pcre.backtrack_limit',10000000);
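
Note that upload_max_filesize and post_max_size are per-directory PHP directives, so ini_set() calls in wp-config.php may silently have no effect depending on the PHP setup. If the values above don't take, a sketch of the equivalent settings for a mod_php Apache host, placed in .htaccess (or the same values in php.ini), would be:

php_value upload_max_filesize 256M
php_value post_max_size 256M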

Clear up any caches

If you use some caching of the website on the machine, such as memcached / ncache / redis / varnish, or haproxy or any other proxy in front of the webserver for some kind of high availability, these can produce strange unexpected Critical Errors on Your Website, so restarting such services or cleaning up their caches is advisable if you have any.
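
For example, on a Debian-style host running some of the mentioned caching daemons, flushing / restarting them would look roughly like this (a sketch; run only the ones you actually have installed):

systemctl restart memcached      # drops the memcached in-memory cache
redis-cli FLUSHALL               # flush all redis databases (destructive!)
systemctl restart varnish        # restart varnish and empty its cache
systemctl reload haproxy         # reload haproxy configuration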
 

What Causes "There Has Been a Critical Error on Your Website" error?


The reason could be practically anything, as WP is a multi-component and somewhat bloated piece of free software. Common causes range from a missing database table or table fields, to a plugin messed up after an update, to a disappeared critical plugin or essential WordPress PHP file; but in my specific case the reason was simply the plugin auto-update, which I had the stupidity to enable.

The WordPress automatic updates, though saving you effort and in most cases protecting your website against recent bugs and exploits and thus increasing the WP security level, often cause issues, and from my personal experience they are not recommended, so better avoid them. And next time you implement any automation on your server, make sure you put some kind of monitoring in place.

Even if you decide to enable it, make sure you do it the right way and not like me: set up some monitoring of the WordPress site via Zabbix / Nagios / Cacti / monit etc., so you are notified immediately if the WordPress-based site goes down.

SEO: Best day and time to write new articles and tweet to get more blog reads – Social Network Timing

Tuesday, June 17th, 2014

what-is-best-time-to-write-articles-to-increase-your-blog-traffic

I'm trying to blog regularly, as this gives me a roadmap of what I'm into and how I spend my time. When I have free time, I blog almost daily except on weekends (on weekends I try to stay away from computers). So if you want to attract more readers to your blog, the interesting question arises:
 

What time is best to hit publish on your posts?

writing-in-the-mogning-on-the-internet-timing-morning-is-best-for-your-posts
Now there are different angles from which you can draw conclusions on the best timing for a blog post. One major thing to always consider when posting is that the highest percentage of users read blogs in the morning with their morning coffee. Here are some more facts on when web content is read the most:

  • 70% of users say they read blogs in the morning
  • More men read blogs at night than women
  • Mondays are the highest traffic days for average blogs
  • 11 a.m. is normally the highest traffic hour for blogs
  • Usually most comments are posted on Saturdays
  • Blogs with more than one post a day have a higher chance of inbound links and usually get more unique visitors

As my blog is more technical oriented most of my visitors are men and therefore posting my blogs at night doesn't interfere much with my readers.
However, I've noticed that for me personally, posting in the time interval from 13:00 to 17:00 positively influences the number of unique visitors the blog gets.

According to research done by Social Fresh, Thursday is the best day to publish an article if you want to get more social shares.

Best-Day-to-Blog-to-get-more-shares-in-social-networks

As a rule of thumb Thursday wins 10% more shares than all other days. In fact, 31% of the top 100 social share days in 2011 fell on Thursday.
My logical explanation for this phenomenon is that people tend to get more and more bored with their work and try to entertain themselves more as the week progresses.

To get more attention on what I'm writing I use a bit of social networking, but I prefer to use only a micro-blogging social network. I use Twitter to share what I'm into. When I write a new article on my blog I tweet its title with a link to the article, because this draws people's attention to what I have to say.

Overall I am skeptical about social sites like Facebook and MySpace, because they have a negative impact on how people use their time, and an especially negative one on youngsters. The other reason why I don't like friend networks is that sites like FB, Google+ or "The Russian Facebook" – Vkontakte VK.com – do not respect the privacy of your data.

 

You write fresh content for their website for free and you get nothing!

 

Moreover, by daily posting the latest buzz you read / watched on Facebook etc., or simply saying what's happening with you, where you're located now and so on, you slowly get addicted to posting – yes, for good or bad, people tend to be maniacal.

By placing all of your personal or impersonal stuff online, you're helping these sites get better indexed in the Google / Yahoo / Yandex search engines, making them profitable and high-ranked websites on the Internet, and giving out your personal time for Facebook's profit. Plus you lose control over your data (your data is not physically on your side but sits on some remote server, somewhere on the Internet).
 

Best average time to post on Twitter, Facebook, Google+ and LinkedIn

best-time-and-day-to-write-new-articles-schedule-content-at-the-right-time-on-social-media-to-get-high-trafficrank

So what is the best day and time to post, pin or tweet?

Below is an infographic I found on this blog (the visual data was originally compiled by SurePayRoll), showing visualized results from some extensive research on the topic.

best-time-to-post-and-tweet-blog-articles-social-media-infographic


Here are the most important facts this infographic reveals:


The average best time to post, tweet and pin your new articles is about 15:00 h
 

  • The best timing to post on Twitter is Monday to Thursday from 13:00 to 15:00 h
  • The best timing to post on Facebook is between 13:00 and 16:00 h
  • For LinkedIn it is best to publish between Tuesday and Thursday


Peak times on Facebook, Twitter and LinkedIn

  • Peak time for use of Facebook is on Wednesdays about 15:00 h
  • Peak times for use of Twitter are Monday to Thursday from 9:00 to 15:00 h
  • LinkedIn peak time is from 17:00 to 18:00 h


Worst times (when users will probably not view your content) on FB, Twitter and LinkedIn

  • Weekends before 08:00 and after 20:00 h
  • Every day after 20:00 and Fridays after 15:00 h
  • Mondays and Fridays from 22:00 to 06:00 in the morning

Facts about Google+
 

  • Google+ is the fastest growing demographic social network for people aged 45 to 54
  • Best time to share your posts on Google+ is from 09:00 to 10:00 in the morning

Images generate more traffic and engagement

  • Including images to your articles increases traffic, tweets with images increase visits, favorites and leads


I'm aware that, like every research, the above info on the best time to tweet and post is just a generalization, and depending on the field of the posted information the suggested time could differ from the optimal time for an individual writer; however, as a general direction the info is very useful and gives you some idea.
Twitter engagement for brands is 17% higher on weekends according to Dan Zarrella’s research. Tweets posted on Friday, Saturday and Sunday had higher CTR (Click Through Rate) than those posted in the rest of the week.

tweet-on-the-weekends-is-better-for-high-click-through-rate

The other best day to tweet, apart from the weekend, is mid-week: Wednesday.
If your site or blog relies on retweets to generate more traffic, the best time to retweet is said to be around 5 pm, when CTR is higher.

Use multiple certificates using one IP address (same IP address) on IIS Windows web server

Saturday, October 24th, 2020

If you have to administer some Windows webservers based on IIS and you're coming from the Linux realm, it can be really confusing how to bind multiple domain certificates to a single IP address.

For those who have done it on Linux: Apache and other webservers in recent versions support, through the SNI (Server Name Indication) TLS extension, serving multiple certificates from a single IP address by reading the requested domain from the header of the incoming SSL connection and matching it against the virtual host with the respective certificate. Below is what I mean: let's say you host two websites, yourdomain.com and yourdomain1.com, on the same IP address, each with its own certificate.

For example in Apache Webserver this is easily done by defining 2 separate virtualhost configuration files similar to below:

/etc/apache2/sites-available/yourdomain.com

<VirtualHost *:443>

    ServerName yourdomain.com
    ServerAlias www.yourdomain.com
    ….

    SSLEngine on
    SSLCertificateFile /etc/letsencrypt/live/yourdomain.com/fullchain.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/yourdomain.com/privkey.pem
</VirtualHost>


 

/etc/apache2/sites-available/yourdomain1.com

<VirtualHost *:443>

    ServerName yourdomain1.com
    ServerAlias www.yourdomain1.com

    SSLEngine on
    SSLCertificateFile /etc/letsencrypt/live/yourdomain1.com/fullchain.pem
    SSLCertificateKeyFile /etc/letsencrypt/live/yourdomain1.com/privkey.pem
</VirtualHost>
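
On a Debian-style Apache setup such as the one the paths above come from, the two virtual hosts are then enabled and the configuration reloaded roughly like this (a sketch; site file names and the reload command may differ on your system):

a2enmod ssl
a2ensite yourdomain.com yourdomain1.com
apache2ctl configtest && systemctl reload apache2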

 

Unfortunately for those who still run legacy Windows servers with IIS version 7 / 7.5, your only option is to use separate IP addresses (or ports, which is not really acceptable for public-facing sites) and to bind each site with its SSL certificate to that IP address.

IIS ver. 8+ supports the Server Name Indication extension of TLS which will allow you to bind multiple SSL sites to the same IP address/port based on the host name. It will be transparent and the binding will work the same as with non-HTTPS sites.

In the Microsoft IIS Webserver it is not possible to configure this by simply editing some configuration files; you have to do it the clicking way, as usually happens in Windows. You will need to have the domain certificate requests generated and so on, and then you can simply do as shown in the screenshots below.

howto-install-iis-8-webserver-ssl-sni-certificate-windows-screenshot
 

iis-config-domain-alias-windows-server-iis-8-webserver

iis-config-domain-alias-windows-server-iis-8-webserver-1

iis-config-domain-alias-windows-server-iis-8-webserver-2

iis-config-domain-alias-windows-server-iis-8-webserver-3

iis-config-domain-alias-windows-server-iis-8-webserver-4
 

Fixing Disappeared intel net link 5100 wifi on Toshiba Satellite L300 1YA

Friday, October 29th, 2010

I've recently had to fix a Toshiba Satellite L300 1YA notebook, running the shitty Windows Vista operating system.
For some reason suddenly the Net Link 5100 Wi-FI Network Adapter of the laptop mysteriously disappeared.
The LED indicating the device is enabled was blinking, the driver showed as properly installed in Windows -> System and everything.
Even so, wireless network scanning on the notebook was not working.

In the networking section of System there were two lines showing up with an exclamation mark "!".

To fix the mess it took me 3.5 hours. I tried many things; the first logical thing was reinstalling the driver with the latest one available from Intel's website. Anyway this didn't help, so I was about to think about other solutions.
What made things even worse was that the installed Vista was in Chinese! Yes, you read this correctly: Chinese. You cannot imagine what a hell it is to deal with a Windows whose language pack is Chinese …

The Vista also had the Chinese antivirus program Rising installed, which provided the system with some weird firewall, and this made dealing with the problem even more complicated.

After many tries I finally completely removed the wireless driver from the system and reinstalled it with the latest version from Intel's website.
Thankfully, after a couple of reboots and going into safe mode, the Intel Net Link 5100 started working again by itself?!

Well, you know how things go with Windows: you never know what will happen next.

Maybe a lot of notebooks suffer the same weird issue with the wireless Wi-Fi adapter suddenly stopping to work.

So now you know the solution: remove the driver, install it again, restart, and it should be working again.
Hope this quick and dirty article will save somebody hours of time figuring it out …

Run Apache with SSL Self Signed SSL Certificate

Friday, August 14th, 2009

Recently I had to run Apache on Debian 4.0 (Etch) with a self-signed certificate. To make it happen I had to Google around and try out stuff. I've read that Debian comes with a command (apache2-ssl-certificate) that generates a self-signed openssl certificate. However, on my Debian systems this command wasn't available, so I had to google around about it, and I came along the following website which provided me with the script itself and some instructions on how to use it. I've modified a bit the archive mentioned on the above website so the install instructions of the website are run through a script. I've built a new archive based on the archive apache2-ssl.tar.gz that includes an extra file runme.sh which does the proper installation for you. The new archive itself could be found here.

In the meantime I recommend you read my article explaining how to quickly and efficiently generate a self-signed certificate with the openssl command on GNU / Linux and BSD.
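
For reference, a one-liner that generates such a self-signed certificate and key with plain openssl looks roughly like this (a sketch; adjust the validity period, subject and file names to taste):

openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -subj "/CN=your-domain.com" \
  -keyout /etc/ssl/private/apache-selfsigned.key \
  -out /etc/ssl/certs/apache-selfsigned.crt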


Block Bad Crawler Bots and Search Engine Spiders overloading your Web server with .htaccess rules

Monday, September 18th, 2017

howto-block-webserver-overloading-bad-crawler-bots-spiders-with-htaccess-modrewrite-rules-file

In the last post, I've talked about the problem of Search Index Crawler Robots aggressively crawling websites and how to stop them (the article is here), explaining how to raise delays between bot URL requests to the website and how to completely prohibit some bots from crawling with robots.txt.
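
As a quick reminder from that article, a minimal robots.txt that asks well-behaved crawlers to slow down and skip part of a site looks roughly like this (a sketch; note that Crawl-delay is honoured by some engines such as Bing and Yandex, but not by Google):

User-agent: *
Crawl-delay: 10
Disallow: /cgi-bin/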

As explained in that article, the consequence of too many badly written or aggressively behaving spiders is "server stoning" and therefore degraded Web Server performance, or even a short-time Denial of Service, depending on how well the initial server scaling was done.

The bots we want to filter are not to be confused with the legitimate bots that drive real traffic to your website. Just for information,

The 10 Most Popular Web Crawler Bots as of the time of writing are:
 

1. GoogleBot (the Google crawler bots; funnily, the bots become less active on Saturdays and Sundays :))

2. BingBot (Bing.com Crawler bots)

3. SlurpBot (also famous as Yahoo! Slurp)

4. DuckDuckBot (the crawler bot of the duckduckgo.com search engine)

5. Baiduspider (the crawler of the most famous Chinese search engine, used as a substitute for Google in China)

6. YandexBot (Russian Yandex search engine crawler bots, used in Russia as a substitute for Google)

7. Sogou Spider (a leading Chinese search engine launched in 2004)

8. Exabot (A French Search Engine, launched in 2000, crawler for ExaLead Search Engine)

9. FaceBot (Facebook External hit; this crawler fetches a webpage only once a user shares or pastes a link to a video, music, blog or whatever in chat to another user)

10. Alexa Crawler (ia_archiver is the web crawler for Amazon's Alexa Internet Rankings; Alexa is a great site to evaluate approximate page popularity on the Internet, and the Alexa SiteInfo page has historically been the Swiss Army knife for anyone wanting to quickly evaluate a webpage's approximate ranking compared to other pages)

The legitimate bots above are known to follow most if not all of the W3C – World Wide Web Consortium (w3.org) – standards and therefore respect the allowance or restriction commands given for a site in robots.txt. Unfortunately, many of the so-called Bad Bots or mirroring scripts that are burning your Web Server CPU and Memory, mentioned in the previous article, follow the /robots.txt prescriptions only partially or not at all.

Hence, for the bots that don't respect robots.txt, the only way to get rid of most of the webspiders that are just loading your bandwidth and server hardware is to filter / block them with Apache rules (mod_setenvif / mod_rewrite) placed in a .htaccess file.

If it does not exist yet, create a .htaccess file in the DocumentRoot of your website with whatever text editor, or create it on your Windows / Mac OS desktop and transfer it via FTP / SecureFTP to the server.

I prefer to do it directly on server with vim (text editor)

 

 

vim /var/www/sites/your-domain.com/.htaccess

 

RewriteEngine On

IndexIgnore .htaccess */.??* *~ *# */HEADER* */README* */_vti*

SetEnvIfNoCase User-Agent "^Black Hole” bad_bot
SetEnvIfNoCase User-Agent "^Titan bad_bot
SetEnvIfNoCase User-Agent "^WebStripper" bad_bot
SetEnvIfNoCase User-Agent "^NetMechanic" bad_bot
SetEnvIfNoCase User-Agent "^CherryPicker" bad_bot
SetEnvIfNoCase User-Agent "^EmailCollector" bad_bot
SetEnvIfNoCase User-Agent "^EmailSiphon" bad_bot
SetEnvIfNoCase User-Agent "^WebBandit" bad_bot
SetEnvIfNoCase User-Agent "^EmailWolf" bad_bot
SetEnvIfNoCase User-Agent "^ExtractorPro" bad_bot
SetEnvIfNoCase User-Agent "^CopyRightCheck" bad_bot
SetEnvIfNoCase User-Agent "^Crescent" bad_bot
SetEnvIfNoCase User-Agent "^Wget" bad_bot
SetEnvIfNoCase User-Agent "^SiteSnagger" bad_bot
SetEnvIfNoCase User-Agent "^ProWebWalker" bad_bot
SetEnvIfNoCase User-Agent "^CheeseBot" bad_bot
SetEnvIfNoCase User-Agent "^Teleport" bad_bot
SetEnvIfNoCase User-Agent "^TeleportPro" bad_bot
SetEnvIfNoCase User-Agent "^MIIxpc" bad_bot
SetEnvIfNoCase User-Agent "^Telesoft" bad_bot
SetEnvIfNoCase User-Agent "^Website Quester" bad_bot
SetEnvIfNoCase User-Agent "^WebZip" bad_bot
SetEnvIfNoCase User-Agent "^moget/2.1" bad_bot
SetEnvIfNoCase User-Agent "^WebZip/4.0" bad_bot
SetEnvIfNoCase User-Agent "^WebSauger" bad_bot
SetEnvIfNoCase User-Agent "^WebCopier" bad_bot
SetEnvIfNoCase User-Agent "^NetAnts" bad_bot
SetEnvIfNoCase User-Agent "^Mister PiX" bad_bot
SetEnvIfNoCase User-Agent "^WebAuto" bad_bot
SetEnvIfNoCase User-Agent "^TheNomad" bad_bot
SetEnvIfNoCase User-Agent "^WWW-Collector-E" bad_bot
SetEnvIfNoCase User-Agent "^RMA" bad_bot
SetEnvIfNoCase User-Agent "^libWeb/clsHTTP" bad_bot
SetEnvIfNoCase User-Agent "^asterias" bad_bot
SetEnvIfNoCase User-Agent "^httplib" bad_bot
SetEnvIfNoCase User-Agent "^turingos" bad_bot
SetEnvIfNoCase User-Agent "^spanner" bad_bot
SetEnvIfNoCase User-Agent "^InfoNaviRobot" bad_bot
SetEnvIfNoCase User-Agent "^Harvest/1.5" bad_bot
SetEnvIfNoCase User-Agent "Bullseye/1.0" bad_bot
SetEnvIfNoCase User-Agent "^Mozilla/4.0 (compatible; BullsEye; Windows 95)" bad_bot
SetEnvIfNoCase User-Agent "^Crescent Internet ToolPak HTTP OLE Control v.1.0" bad_bot
SetEnvIfNoCase User-Agent "^CherryPickerSE/1.0" bad_bot
SetEnvIfNoCase User-Agent "^CherryPicker /1.0" bad_bot
SetEnvIfNoCase User-Agent "^WebBandit/3.50" bad_bot
SetEnvIfNoCase User-Agent "^NICErsPRO" bad_bot
SetEnvIfNoCase User-Agent "^Microsoft URL Control – 5.01.4511" bad_bot
SetEnvIfNoCase User-Agent "^DittoSpyder" bad_bot
SetEnvIfNoCase User-Agent "^Foobot" bad_bot
SetEnvIfNoCase User-Agent "^WebmasterWorldForumBot" bad_bot
SetEnvIfNoCase User-Agent "^SpankBot" bad_bot
SetEnvIfNoCase User-Agent "^BotALot" bad_bot
SetEnvIfNoCase User-Agent "^lwp-trivial/1.34" bad_bot
SetEnvIfNoCase User-Agent "^lwp-trivial" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.6" bad_bot
SetEnvIfNoCase User-Agent "^BunnySlippers" bad_bot
SetEnvIfNoCase User-Agent "^Microsoft URL Control – 6.00.8169" bad_bot
SetEnvIfNoCase User-Agent "^URLy Warning" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.5.3" bad_bot
SetEnvIfNoCase User-Agent "^LinkWalker" bad_bot
SetEnvIfNoCase User-Agent "^cosmos" bad_bot
SetEnvIfNoCase User-Agent "^moget" bad_bot
SetEnvIfNoCase User-Agent "^hloader" bad_bot
SetEnvIfNoCase User-Agent "^humanlinks" bad_bot
SetEnvIfNoCase User-Agent "^LinkextractorPro" bad_bot
SetEnvIfNoCase User-Agent "^Offline Explorer" bad_bot
SetEnvIfNoCase User-Agent "^Mata Hari" bad_bot
SetEnvIfNoCase User-Agent "^LexiBot" bad_bot
SetEnvIfNoCase User-Agent "^Web Image Collector" bad_bot
SetEnvIfNoCase User-Agent "^The Intraformant" bad_bot
SetEnvIfNoCase User-Agent "^True_Robot/1.0" bad_bot
SetEnvIfNoCase User-Agent "^True_Robot" bad_bot
SetEnvIfNoCase User-Agent "^BlowFish/1.0" bad_bot
SetEnvIfNoCase User-Agent "^JennyBot" bad_bot
SetEnvIfNoCase User-Agent "^MIIxpc/4.2" bad_bot
SetEnvIfNoCase User-Agent "^BuiltBotTough" bad_bot
SetEnvIfNoCase User-Agent "^ProPowerBot/2.14" bad_bot
SetEnvIfNoCase User-Agent "^BackDoorBot/1.0" bad_bot
SetEnvIfNoCase User-Agent "^toCrawl/UrlDispatcher" bad_bot
SetEnvIfNoCase User-Agent "^WebEnhancer" bad_bot
SetEnvIfNoCase User-Agent "^TightTwatBot" bad_bot
SetEnvIfNoCase User-Agent "^suzuran" bad_bot
SetEnvIfNoCase User-Agent "^VCI WebViewer VCI WebViewer Win32" bad_bot
SetEnvIfNoCase User-Agent "^VCI" bad_bot
SetEnvIfNoCase User-Agent "^Szukacz/1.4" bad_bot
SetEnvIfNoCase User-Agent "^QueryN Metasearch" bad_bot
SetEnvIfNoCase User-Agent "^Openfind data gathere" bad_bot
SetEnvIfNoCase User-Agent "^Openfind" bad_bot
SetEnvIfNoCase User-Agent "^Xenu’s Link Sleuth 1.1c" bad_bot
SetEnvIfNoCase User-Agent "^Xenu’s" bad_bot
SetEnvIfNoCase User-Agent "^Zeus" bad_bot
SetEnvIfNoCase User-Agent "^RepoMonkey Bait & Tackle/v1.01" bad_bot
SetEnvIfNoCase User-Agent "^RepoMonkey" bad_bot
SetEnvIfNoCase User-Agent "^Zeus 32297 Webster Pro V2.9 Win32" bad_bot
SetEnvIfNoCase User-Agent "^Webster Pro" bad_bot
SetEnvIfNoCase User-Agent "^EroCrawler" bad_bot
SetEnvIfNoCase User-Agent "^LinkScan/8.1a Unix" bad_bot
SetEnvIfNoCase User-Agent "^Keyword Density/0.9" bad_bot
SetEnvIfNoCase User-Agent "^Kenjin Spider" bad_bot
SetEnvIfNoCase User-Agent "^Cegbfeieh" bad_bot

 

<Limit GET POST>
order allow,deny
allow from all
Deny from env=bad_bot
</Limit>
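
Note that the order / allow / deny syntax above is the Apache 2.2 style (it still works on 2.4 via mod_access_compat). On Apache 2.4 the native equivalent of the same block would be a sketch like:

<RequireAll>
    Require all granted
    Require not env bad_bot
</RequireAll>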

 


The rules above are bad-bot prohibition rules and include the RewriteEngine On directive; however, on many websites this directive is already enabled directly in the VirtualHost section for the domain(s). If that is your case, you might also remove RewriteEngine On from .htaccess and the bad-bot prohibition rules will still continue to work.
The rules above are also perfectly suitable for WordPress-based websites / blogs in case you need to filter out obstructive spiders, though they will work on any website domain with mod_rewrite and mod_setenvif enabled.

Once you have implemented the above rules, you will not need to restart Apache, as .htaccess is read dynamically on each client request to the Webserver.

2. Testing .htaccess Bad Bots Filtering Works as Expected


In order to test that the new Bad Bot filtering configuration is working properly, there is a manual and slightly more complicated way with lynx (a text browser), assuming you have shell access to a Linux / BSD / *NIX computer, or run your own *NIX server / desktop.
 

Here is how:
 

 

lynx -useragent="Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)" -head -dump http://www.your-website-filtering-bad-bots.com/

 

 

Note that lynx will provide a warning such as:

Warning: User-Agent string does not contain "Lynx" or "L_y_n_x"!

Just ignore it and press enter to continue.
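
Alternatively, if curl is available, you can spoof one of the filtered User-Agents and check that the server answers with a 403 (CheeseBot is just one of the agents from the list above):

# send a HEAD request pretending to be one of the blocked bots
curl -I -A "CheeseBot" http://www.your-website-filtering-bad-bots.com/
# a working filter should answer: HTTP/1.1 403 Forbidden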

Two other lynx use cases that I historically used heavily are pretending to be GoogleBot, in order to see how Google actually sees your website:
 

  • Pretend with Lynx You're GoogleBot

 

lynx -useragent="Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" -head -dump http://www.your-domain.com/

 

 

  • How to Pretend with Lynx Browser You are GoogleBot-Mobile

 

lynx -useragent="Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)" -head -dump http://www.your-domain.com/

 


Or, for the lazy ones who don't have Linux / *NIX at their disposal, you can use the WannaBrowser website.

WannaBrowser is a web-based browser emulator which gives you the ability to change the User-Agent on each website request, so just set your User-Agent to any of the bot agents we just filtered, for example set the User-Agent to CheeseBot.

Once the .htaccess rules added earlier detect that your client is coming in with a prohibited browser agent, the request will immediately be filtered out and you'll be unable to access the website, getting a message like:
 

HTTP/1.1 403 Forbidden

 

Since I've talked a lot about index bots, I think it is worth also mentioning three great websites that can give you a lot of up-to-date information on the exact user-agents returned by spiders, commonly known bot traits, as well as a currently updated list of the bad bots etc.

Bot and browser resources: information on user-agents, bad bots and odd crawler / bot specifics

1. botreports.com
2. user-agents.org
3. useragentapi.com

 

An updated list of robot user-agents (crawler-user-agents) is also available on GitHub here, regularly updated by Caio Almeida.

There are also third-party plugins (modules) available for website platforms like WordPress / Joomla / TYPO3 etc.

Besides those listed on these websites and the known good and bad bots, there are perhaps hundreds of others that might end up crawling your website and might or might not need to be filtered; therefore, before proceeding with any filtering steps, it is generally a good idea to monitor your HTTPD access.log / error.log, since if you happen to mistakenly filter the wrong bot this might cause Website Indexing Problems.

Hope this article gives you some valuable information. Enjoy! 🙂

 

Check how webpage looks with Internet Explorer on Linux and FreeBSD with Mozilla Firefox (Netrenderer Firefox plugin)

Thursday, November 1st, 2012

Simulate Internet Explorer in screenshots on GNU / Linux and FreeBSD using Netrenderer in Firefox - Internet Explorer testing tool for web developers on Linux and FreeBSD

I'm not a full-time web developer, but sometimes I develop websites too, or just have to do some website testing.
I've been using GNU / Linux and BSD as my main server and desktop platforms for many years already, and hence I don't have regular access to a Windows OS and respectively Internet Explorer. In that line of thought, it is very useful to have a way to check if a certain website I create displays fine in Internet Explorer 6, 7 and 8 too.

Usually, when I need to test whether a website displays its elements properly in Internet Explorer, I use the infamous http://ipinfo.info/netrenderer/index.php – I guess it is almost impossible that anyone developing websites on Linux doesn't know it :). Fortunately, while I was googling to remind myself of the exact link to netrenderer, I stumbled upon a Mozilla Firefox add-on extension which does precisely what the ipinfo.info/netrenderer/ website does – i.e. renders a website with an HTML engine compatible with most Internet Explorer versions and creates screenshots of how the website would look under Internet Explorer. Of course the plugin is not a panacea, and since it only makes screenshots, it is of zero use for spotting problems with the interactivity (JavaScript / AJAX) of a website in IE. However, in general it is good to know if at least the website elements are ordered fine.
After the plugin is added in the usual way as any other plugin in FF, you can start using it with keyboard shortcuts:

Ctrl+Shift+F5/F6/F7/F8 – respectively renders the page in IE5.5, IE 6, IE 7 / IE 8 Beta 2

Pressing Ctrl + Shift + F5–F8 makes the IE screenshot of the site using http://ipinfo.info/netrenderer/

I'm currently running the latest Firefox version 16.0.2 and the plugin works fine here; I guess it should also work fine on most FF releases that are not older than a few years.

Below is a description of the plugin, as taken from the plugin website:

IE NetRendered Add-on Description

Adds buttons, tools menu and contextual menu entries to get a screenshot of the current page with IE NetRenderer.

Keyboard shortcuts are also available: Ctrl+Shift+F5/F6/F7/F8 to render the page in IE5.5/6/7/8 Beta 2 (Cmd+Shift+F* on Mac).

Really useful for webmasters which are not using Windows!

You can also access the IE NetRenderer service here: http://ipinfo.info/netrenderer/index.php

Please note that the extension developper is not affiliated with GEOTEK, providing the IE NetRenderer service. You can visit his website here: http://nicopensource.free.fr/
How to find how much power (electricity) a server or PC consumes

Friday, November 2nd, 2012

Kill-A-Watt track system power electricty consumption on GNU / Linux servers and FreeBSD
A friend of mine asked me today if I have a clue whether it is possible to track his home computer's power consumption with some piece of software.

The question is quite interesting, since I run a home server with Linux and it would be nice if I could track exactly how much electricity per month it consumes.

Not knowing the answer, I first checked online for some kind of software, and all I could find that does something similar is powertop.

Though powertop is a nice Linux tool to keep an eye on which program on the PC consumes the largest share of power and to order the programs and modules by electricity consumption, it does not provide information on the overall electricity consumption.

As the topic seemed interesting, I decided to ask in irc.freenode.net #debian.
Here is a paste from the irssi channel log:

17:21 < hipodilski> hi any idea, how can I find how much electricity a server conmuses per month
17:21 < hipodilski> is there some some kind of software
17:21 -!- digdilem [~digdilem@plague.digdilem.org] has joined #debian
17:22 < babilen> hipodilski: I would recommend an electricity meter rather than software
17:22 -!- tommy_e [~tommy@81.27.221.202] has quit [Ping timeout: 260 seconds]
17:22 < jelly-home> watt meters ftw
17:22 -!- msx [~msx@190.194.114.10] has joined #debian
17:22 -!- blackshirt [~najwa@103.3.223.5] has left #debian []
17:23 < hipodilski> yes but i don't have electricity metter, if there is software it would be interesting to try it
17:23 -!- badiane [~gdurand@D8FF67fa.cst.lightpath.net] has quit [Remote host closed the connection]
17:23 < xand> hipodilski: no, you need a hardware device.
17:23 < jelly-home> now everything can be solved in software, hipodilski
17:23 < jelly-home> not*
17:23 < jelly-home> dammit
17:23 < xand> unless you have a very fancy PSU, software can't find that out
17:23 < babilen> jelly-home: hehe, nice typo !
17:23 < vacuous> hipodilski yes
17:24 < HelloShitty> nsadmin, are you out of ideas for me?
17:24 < vacuous> there's various devices that do it
17:24 -!- firecode [~irc@unaffiliated/firecode] has joined #debian
17:24 < vacuous> you can either get a killawat which are highly innacurate but it might give you a clue
17:24 < vacuous> and they're very cheap too
17:25 < vacuous> you can get a device which measures your entire houses electric, then you just turn off all the appliances and run the
                 server only
17:25 -!- trysten [~trysten@37-251-103-145.FTTH.ispfabriek.nl] has quit [Quit: be back]
17:25  * babilen likes that approach
17:25 < babilen> But this is getting a bit too off-topic. Maybe hipodilski wants to take it to #debian-offtopic
17:25 < vacuous> or you can keep all fridges on, check what the reading is and then negate that from the total
17:25 < hipodilski> yes thanks 🙂
 

The answer makes it clear: as of the time of writing this post, there is no software for Linux or BSD that keeps track of electricity consumption daily or monthly.

I've googled to see what the Kill-A-Watt hardware is and found the funnily named Kill-A-Watt device for sale on ThinkGeek's website for the not-so-expensive $24.99.

To use Kill-A-Watt device is to be connected inside the power plug and then PC or Server has to be plugged into  Kill-A-Watt dev. I've red also (while researching) many Intelligent UPS devs has support for keeping log of discharged energy, so just buying a good UPS with web administrator or even a cheap one providing statistical information of UPS use via serial port should be another alternative to track ur server consumption.