Posts Tagged ‘substitute’

Block Web server over loading Bad Crawler Bots and Search Engine Spiders with .htaccess rules

Monday, September 18th, 2017

howto-block-webserver-overloading-bad-crawler-bots-spiders-with-htaccess-modrewrite-rules-file

In last post, I've talked about the problem of Search Index Crawler Robots aggressively crawling websites and how to stop them (the article is here) explaning how to raise delays between Bot URL requests to website and how to completely probhit some bots from crawling with robots.txt.

As explained in article the consequence of too many badly written or agressive behaviour Spider is the "server stoning" and therefore degraded Web Server performance as a cause or even a short time Denial of Service Attack, depending on how well was the initial Server Scaling done.

The bots we want to filter are not to be confused with the legitimate bots, that drives real traffic to your website, just for information

 The 10 Most Popular WebCrawlers Bots as of time of writting are:
 

1. GoogleBot (The Google Crawler bots, funnily bots become less active on Saturday and Sundays :))

2. BingBot (Bing.com Crawler bots)

3. SlurpBot (also famous as Yahoo! Slurp)

4. DuckDuckBot (The dutch search engine duckduckgo.com crawler bots)

5. Baiduspider (The Chineese most famous search engine used as a substitute of Google in China)

6. YandexBot (Russian Yandex Search engine crawler bots used in Russia as a substitute for Google )

7. Sogou Spider (leading Chineese Search Engine launched in 2004)

8. Exabot (A French Search Engine, launched in 2000, crawler for ExaLead Search Engine)

9. FaceBot (Facebook External hit, this crawler is crawling a certain webpage only once the user shares or paste link with video, music, blog whatever  in chat to another user)

10. Alexa Crawler (la_archiver is a web crawler for Amazon's Alexa Internet Rankings, Alexa is a great site to evaluate the approximate page popularity on the internet, Alexa SiteInfo page has historically been the Swift Army knife for anyone wanting to quickly evaluate a webpage approx. ranking while compared to other pages)

Above legitimate bots are known to follow most if not all of W3C – World Wide Web Consorium (W3.Org) standards and therefore, they respect the content commands for allowance or restrictions on a single site as given from robots.txt but unfortunately many of the so called Bad-Bots or Mirroring scripts that are burning your Web Server CPU and Memory mentioned in previous article are either not following /robots.txt prescriptions completely or partially.

Hence with the robots.txt unrespective bots, the case the only way to get rid of most of the webspiders that are just loading your bandwidth and server hardware is to filter / block them is by using Apache's mod_rewrite through

 

.htaccess


file

Create if not existing in the DocumentRoot of your website .htaccess file with whatever text editor, or create it your windows / mac os desktop and transfer via FTP / SecureFTP to server.

I prefer to do it directly on server with vim (text editor)

 

 

vim /var/www/sites/your-domain.com/.htaccess

 

RewriteEngine On

IndexIgnore .htaccess */.??* *~ *# */HEADER* */README* */_vti*

SetEnvIfNoCase User-Agent "^Black Hole” bad_bot
SetEnvIfNoCase User-Agent "^Titan bad_bot
SetEnvIfNoCase User-Agent "^WebStripper" bad_bot
SetEnvIfNoCase User-Agent "^NetMechanic" bad_bot
SetEnvIfNoCase User-Agent "^CherryPicker" bad_bot
SetEnvIfNoCase User-Agent "^EmailCollector" bad_bot
SetEnvIfNoCase User-Agent "^EmailSiphon" bad_bot
SetEnvIfNoCase User-Agent "^WebBandit" bad_bot
SetEnvIfNoCase User-Agent "^EmailWolf" bad_bot
SetEnvIfNoCase User-Agent "^ExtractorPro" bad_bot
SetEnvIfNoCase User-Agent "^CopyRightCheck" bad_bot
SetEnvIfNoCase User-Agent "^Crescent" bad_bot
SetEnvIfNoCase User-Agent "^Wget" bad_bot
SetEnvIfNoCase User-Agent "^SiteSnagger" bad_bot
SetEnvIfNoCase User-Agent "^ProWebWalker" bad_bot
SetEnvIfNoCase User-Agent "^CheeseBot" bad_bot
SetEnvIfNoCase User-Agent "^Teleport" bad_bot
SetEnvIfNoCase User-Agent "^TeleportPro" bad_bot
SetEnvIfNoCase User-Agent "^MIIxpc" bad_bot
SetEnvIfNoCase User-Agent "^Telesoft" bad_bot
SetEnvIfNoCase User-Agent "^Website Quester" bad_bot
SetEnvIfNoCase User-Agent "^WebZip" bad_bot
SetEnvIfNoCase User-Agent "^moget/2.1" bad_bot
SetEnvIfNoCase User-Agent "^WebZip/4.0" bad_bot
SetEnvIfNoCase User-Agent "^WebSauger" bad_bot
SetEnvIfNoCase User-Agent "^WebCopier" bad_bot
SetEnvIfNoCase User-Agent "^NetAnts" bad_bot
SetEnvIfNoCase User-Agent "^Mister PiX" bad_bot
SetEnvIfNoCase User-Agent "^WebAuto" bad_bot
SetEnvIfNoCase User-Agent "^TheNomad" bad_bot
SetEnvIfNoCase User-Agent "^WWW-Collector-E" bad_bot
SetEnvIfNoCase User-Agent "^RMA" bad_bot
SetEnvIfNoCase User-Agent "^libWeb/clsHTTP" bad_bot
SetEnvIfNoCase User-Agent "^asterias" bad_bot
SetEnvIfNoCase User-Agent "^httplib" bad_bot
SetEnvIfNoCase User-Agent "^turingos" bad_bot
SetEnvIfNoCase User-Agent "^spanner" bad_bot
SetEnvIfNoCase User-Agent "^InfoNaviRobot" bad_bot
SetEnvIfNoCase User-Agent "^Harvest/1.5" bad_bot
SetEnvIfNoCase User-Agent "Bullseye/1.0" bad_bot
SetEnvIfNoCase User-Agent "^Mozilla/4.0 (compatible; BullsEye; Windows 95)" bad_bot
SetEnvIfNoCase User-Agent "^Crescent Internet ToolPak HTTP OLE Control v.1.0" bad_bot
SetEnvIfNoCase User-Agent "^CherryPickerSE/1.0" bad_bot
SetEnvIfNoCase User-Agent "^CherryPicker /1.0" bad_bot
SetEnvIfNoCase User-Agent "^WebBandit/3.50" bad_bot
SetEnvIfNoCase User-Agent "^NICErsPRO" bad_bot
SetEnvIfNoCase User-Agent "^Microsoft URL Control – 5.01.4511" bad_bot
SetEnvIfNoCase User-Agent "^DittoSpyder" bad_bot
SetEnvIfNoCase User-Agent "^Foobot" bad_bot
SetEnvIfNoCase User-Agent "^WebmasterWorldForumBot" bad_bot
SetEnvIfNoCase User-Agent "^SpankBot" bad_bot
SetEnvIfNoCase User-Agent "^BotALot" bad_bot
SetEnvIfNoCase User-Agent "^lwp-trivial/1.34" bad_bot
SetEnvIfNoCase User-Agent "^lwp-trivial" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.6" bad_bot
SetEnvIfNoCase User-Agent "^BunnySlippers" bad_bot
SetEnvIfNoCase User-Agent "^Microsoft URL Control – 6.00.8169" bad_bot
SetEnvIfNoCase User-Agent "^URLy Warning" bad_bot
SetEnvIfNoCase User-Agent "^Wget/1.5.3" bad_bot
SetEnvIfNoCase User-Agent "^LinkWalker" bad_bot
SetEnvIfNoCase User-Agent "^cosmos" bad_bot
SetEnvIfNoCase User-Agent "^moget" bad_bot
SetEnvIfNoCase User-Agent "^hloader" bad_bot
SetEnvIfNoCase User-Agent "^humanlinks" bad_bot
SetEnvIfNoCase User-Agent "^LinkextractorPro" bad_bot
SetEnvIfNoCase User-Agent "^Offline Explorer" bad_bot
SetEnvIfNoCase User-Agent "^Mata Hari" bad_bot
SetEnvIfNoCase User-Agent "^LexiBot" bad_bot
SetEnvIfNoCase User-Agent "^Web Image Collector" bad_bot
SetEnvIfNoCase User-Agent "^The Intraformant" bad_bot
SetEnvIfNoCase User-Agent "^True_Robot/1.0" bad_bot
SetEnvIfNoCase User-Agent "^True_Robot" bad_bot
SetEnvIfNoCase User-Agent "^BlowFish/1.0" bad_bot
SetEnvIfNoCase User-Agent "^JennyBot" bad_bot
SetEnvIfNoCase User-Agent "^MIIxpc/4.2" bad_bot
SetEnvIfNoCase User-Agent "^BuiltBotTough" bad_bot
SetEnvIfNoCase User-Agent "^ProPowerBot/2.14" bad_bot
SetEnvIfNoCase User-Agent "^BackDoorBot/1.0" bad_bot
SetEnvIfNoCase User-Agent "^toCrawl/UrlDispatcher" bad_bot
SetEnvIfNoCase User-Agent "^WebEnhancer" bad_bot
SetEnvIfNoCase User-Agent "^TightTwatBot" bad_bot
SetEnvIfNoCase User-Agent "^suzuran" bad_bot
SetEnvIfNoCase User-Agent "^VCI WebViewer VCI WebViewer Win32" bad_bot
SetEnvIfNoCase User-Agent "^VCI" bad_bot
SetEnvIfNoCase User-Agent "^Szukacz/1.4" bad_bot
SetEnvIfNoCase User-Agent "^QueryN Metasearch" bad_bot
SetEnvIfNoCase User-Agent "^Openfind data gathere" bad_bot
SetEnvIfNoCase User-Agent "^Openfind" bad_bot
SetEnvIfNoCase User-Agent "^Xenu’s Link Sleuth 1.1c" bad_bot
SetEnvIfNoCase User-Agent "^Xenu’s" bad_bot
SetEnvIfNoCase User-Agent "^Zeus" bad_bot
SetEnvIfNoCase User-Agent "^RepoMonkey Bait & Tackle/v1.01" bad_bot
SetEnvIfNoCase User-Agent "^RepoMonkey" bad_bot
SetEnvIfNoCase User-Agent "^Zeus 32297 Webster Pro V2.9 Win32" bad_bot
SetEnvIfNoCase User-Agent "^Webster Pro" bad_bot
SetEnvIfNoCase User-Agent "^EroCrawler" bad_bot
SetEnvIfNoCase User-Agent "^LinkScan/8.1a Unix" bad_bot
SetEnvIfNoCase User-Agent "^Keyword Density/0.9" bad_bot
SetEnvIfNoCase User-Agent "^Kenjin Spider" bad_bot
SetEnvIfNoCase User-Agent "^Cegbfeieh" bad_bot

 

<Limit GET POST>
order allow,deny
allow from all
Deny from env=bad_bot
</Limit>

 


Above rules are Bad bots prohibition rules have RewriteEngine On directive included however for many websites this directive is enabled directly into VirtualHost section for domain/s, if that is your case you might also remove RewriteEngine on from .htaccess and still the prohibition rules of bad bots should continue to work
Above rules are also perfectly suitable wordpress based websites / blogs in case you need to filter out obstructive spiders even though the rules would work on any website domain with mod_rewrite enabled.

Once you have implemented above rules, you will not need to restart Apache, as .htaccess will be read dynamically by each client request to Webserver

2. Testing .htaccess Bad Bots Filtering Works as Expected


In order to test the new Bad Bot filtering configuration is working properly, you have a manual and more complicated way with lynx (text browser), assuming you have shell access to a Linux / BSD / *Nix computer, or you have your own *NIX server / desktop computer running
 

Here is how:
 

 

lynx -useragent="Mozilla/5.0 (compatible; MegaIndex.ru/2.0; +http://megaindex.com/crawler)" -head -dump http://www.your-website-filtering-bad-bots.com/

 

 

Note that lynx will provide a warning such as:

Warning: User-Agent string does not contain "Lynx" or "L_y_n_x"!

Just ignore it and press enter to continue.

Two other use cases with lynx, that I historically used heavily is to pretent with Lynx, you're GoogleBot in order to see how does Google actually see your website?
 

  • Pretend with Lynx You're GoogleBot

 

lynx -useragent="Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)" -head -dump http://www.your-domain.com/

 

 

  • How to Pretend with Lynx Browser You are GoogleBot-Mobile

 

lynx -useragent="Mozilla/5.0 (iPhone; U; CPU iPhone OS 4_1 like Mac OS X; en-us) AppleWebKit/532.9 (KHTML, like Gecko) Version/4.0.5 Mobile/8B117 Safari/6531.22.7 (compatible; Googlebot-Mobile/2.1; +http://www.google.com/bot.html)" -head -dump http://www.your-domain.com/

 


Or for the lazy ones that doesn't have Linux / *Nix at disposal you can use WannaBrowser website

Wannabrowseris a web based browser emulator which gives you the ability to change the User-Agent on each website req1uest, so just set your UserAgent to any bot browser that we just filtered for example set User-Agent to CheeseBot

The .htaccess rule earier added once detecting your browser client is coming in with the prohibit browser agent will immediately filter out and you'll be unable to access the website with a message like:
 

HTTP/1.1 403 Forbidden

 

Just as I've talked a lot about Index Bots, I think it is worthy to also mention three great websites that can give you a lot of Up to Date information on exact Spiders returned user-agent, common known Bot traits as well as a a current updated list with the Bad Bots etc.

Bot and Browser Resources information user-agents, bad-bots and odd Crawlers and Bots specifics

1. botreports.com
2. user-agents.org
3. useragentapi.com

 

An updated list with robots user-agents (crawler-user-agents) is also available in github here regularly updated by Caia Almeido

There are also a third party plugin (modules) available for Website Platforms like WordPress / Joomla / Typo3 etc.

Besides the listed on these websites as well as the known Bad and Good Bots, there are perhaps a hundred of others that might end up crawling your webdsite that might or might not need  to be filtered, therefore before proceeding with any filtering steps, it is generally a good idea to monitor your  HTTPD access.log / error.log, as if you happen to somehow mistakenly filter the wrong bot this might be a reason for Website Indexing Problems.

Hope this article give you some valueable information. Enjoy ! 🙂

 

My struggles to find good substitute for old QWERTY Nokia 9300 Communicator and The Death of QWERTY Smart phones

Monday, January 27th, 2014

Nowadays having a a touch screen mobile phone has become like a standard. I'm not such a big fan of Touch screen technology, thus I've been fighting with the idea to own a touch screen phone for a year or so. Just till recently I happily lived with my old Nokia 9300i with a physical QWERTY keyboard for already 4 years.

Unfortunately lately while talking with my Nokia I started getting frequent voice interruptions missing words in my phone call conversations and need to ask person I'm talking to, to repeat his words / sentence in order to understand what is communicated .. I'm economic person and therefore don't like bying anything new if it is not absolutely necessary so I opened the phone and clean it hoping that this will solve the conversation issues but with no luck. With this half-usable mobile my only option left was to buy a new mobile phone.

I'm not very rigorious on what a mobile phone should be and I'm very much minimalist by heart so I was thinking of bying new cheapest available Nokia phone on the market and solve my "issue" quick and efficient,  only problem was  I'm quite used already of using my handy QWERTY phone as a note taking device thus I preferred to not buy a keypad mobile phone but get again a smartphone with physical QWERTY.

I have consulted with some friends who are more knowledgable on what's latest on mobile phone market with a question what will be a good substitute for my Nokia Communicator 9300i and heard comments like:
'IPhone is the most functional and superior in interface', while some friends and colleagues adviced me:
'Choose an Android based phone as Android is Linux based and gives more freedom to the user as well as has more free applications to install'

I appreaciated my friends help but I didn't like the idea to buy a Smartphone with a touchscreen display – virtual keyboard is not so confortable as having a physical one and besides this is a very slow interface compared to physical keys. Thus initially strongly rejected the idea of bying a phone without a physical keyboard. After some weeks of pondering and checking in the market – in 3 Major mobile operators shops in Bulgaria MTel, Globul and VIVACOM and HANDY store. I've find out currently on the market there is no good price / quality and functionality ratio (qwerty keyboard mobile) available. My options were limited to either by a Nokia Asha 210 or some variance or a Blackberry mobile.

nokia_asha_210_mobile_phone

In first glimpse I liked the Nokia ASHA 210 – QWERTY powered mobile  but after noticing the blue Facebook "F" button got quickly jolted.
The sales lady offered me a couple of other Nokias with Qwerty keyboard as well as a Blackberry 9320 Curve.

blackberry_curve_smart_mobile_phone
After a quick test of all QWERTY mobiles, found the intertface on both is so inferior to IPhone's IOS and Android based phones.
I asked my HP workmates for advice of a good QWERTY bundled mobile phone with Android and was referred to Motorolla Droid 3 – which seems to among the only options on the market for mobile Phone which have both Android Operating System and a Physical QWERTY keyboardMotorolla Droid 3 seemed to be exactly the mobile I was looking for but unfortunately it is not available in Mobile phones stores in Bulgaria and only in bulgaria is only offered for sale as a second hand and I had to buy it over the Internet (I prefer not to buy on the Internet). Even if I bought it as second hand  DROID 3's price is too high for my budget – 250 EUR!

Motorola-Droid-3-mobile-nokia-9300i-substitute

I wanted to buy economic phone and same time to have a good balance between price and phone modernity, same time don't tie myself with mobile operator yearly tax plan thus decided to pay my whole mobile price in cache (no credits, no binding 2 / 3 year conversation plans).

After evaluating the options on Market I stopped on two mobiles identical by price 150 EUR I could choose between Samsung Galaxy Trend Lite or ZTE Blade 3. The sales lady adviced me its better to get the ZTE Blade 3 than Samsung Galaxy Trend Lite (S7350) because ZTE has better Camera (5 Mpixels), a better Display and has much less hardware issues than Samsung Galaxy Lite.

Samsung-Galaxy-Trend-Lite-S7390-smart-mobile-phone

 

Finally I bought the ZTE Blade 3 and nowdays I'm trying to get used to it and to be honest even with a week passed I still can't get used to the Virtual Keyboard

Android interface is quite shiny but a little bit chaotic if compared to design use interface I've tested on IPhones. Android OS seems to behave very weird at times but in general is quite easy to use. Managing / installing / Removing applications from Google AppStore is done by only 2 clicks. My major concern on Android is its highly addictive. I've catch myself, since last week I spend much more time using my mobile than before with my Nokia Communicator …
ZTE-Blade-III-Black-smart-phone

To conclude it I would say living with a smartphone has its advantagous (you can easily check weather prognosis / news) and do a number of things with it, but it is addictive .. obviously its easy to  become an Android addict and spend your free time on useless stuff like installing / testing new apps and playing with phone. Having a smartphone just like I priorly suspected is a big time eater and it seems my hypothesis  that its better to live without a smart phone is true. But who knows, perhaps its just a moment addictiveness just like with any new thing posession – time will show. In meantime I believe my ZTE Blade III – purchase was a good deal as it gives me opportunity to explore Android OS. I'll stop here with my ranting and excuse myself if the article was too boring …Please drop me a comment with mobile types and names who had QWERTY keyboard and a modern OS. Very sadly it seems the QWERTY hardware keyboard mobiles will soon be dead and gone …

Installing XMMS on Debian Squeeze from a Package / Installing XMMS on Debian – the debian way

Tuesday, July 17th, 2012

installing xmms on debian squeeze linux playing free software song green skin screenshot

I use Debian Linux for my desktop for quite some time; Even though there are plenty of MP3 / CD players around in Debian, I’m used to the good old XMMS, hence I often prefer to use XMMS to play my music instead of newer players like RhythmBox or audacious.
Actually audacious is not bad substitute for XMMS and is by default part of Debian but to me it seems more buggy and tends to crash during playing some music formats more than xmms ….

As most people might know, XMMS is no longer supported in almost all modern Linux distributions, so anyone using Debian, Ubuntu or other deb derivative Linux would have to normally compile it from source.
Compiling from source is time consuming and I think often it doesn’t pay back the effort. Thanksfully, though not officially supported by Debian crew XMMS still can be installed using a deb xmms prebuilt package repository kindly provided by a hacker fellow knuta.

Using the pre-build deb packages, installing xmms on new Debian installs comes to:

debian:~# echo 'deb http://www.pvv.ntnu.no/~knuta/xmms/squeeze ./' >> /etc/apt/sources.list
debian:~# echo 'deb-src http://www.pvv.ntnu.no/~knuta/xmms/squeeze ./' >> /etc/apt/sources.list
debian:~# apt-get update && apt-get -y install xmms

There are also deb xmms built for Ubuntu, so Ubuntu users could install xmms using repositories:

deb http://www.pvv.ntnu.no/~knuta/xmms/karmic ./
deb-src http://www.pvv.ntnu.no/~knuta/xmms/karmic ./
That’s all now xmms is ready to use. Enjoy 🙂

Improve default picture viewing on Slackware Linux with XFCE as Desktop environment

Saturday, March 17th, 2012

Default XFce picture viewer on Slackware Linux is GIMP (GNU Image Manipulation Program). Though GIMP is great for picture editting, it is rather strange why Patrick Volkerding compiled XFCE to use GIMP as a default picture viewer? The downsides of GIMP being default picture viewing program for Slackware's XFCE are the same like Xubuntu's XFCE risterroro, you can't switch easily pictures back and forward with some keyboard keys (left, right arrow keys, backspace or space etc.). Besides that another disadvantage of using GIMP are;
a) picture opening time in GIMP loading is significantly higher if compared to a simple picture viewer program like Gnome's default, eye of the gnomeeog.

b) GIMP is more CPU intensive and puts high load on each picture opening

A default Slackware install comes with two good picture viewing programs substitute for GIMP:
 

  • Gwenview

    Gwenview on Slackware Linux picture screenshot XFCE

  •  
  • Geeqie
  • Geeqie Slackware Linux Screenshot XFCE

    Both of the programs support picture changing, so if you open a picture you can switch to the other ones in the same directory as the first opened one.
    I personally liked more Gwenview because it has more intutive picture switching controls. With it you can switch with keyboard keys space and backspace

    To change GIMP's default PNG, JPEG opening I had with mouse right button over a pic and in properties change, Open With: program.

    XFCE4 Slackware Linux picture file properties window

    If you're curious about the picture on on all screenshots, this is Church – Saint George (situated in the city center of Dobrich, Bulgaria).
    St. Georgi / St. George Church is built in 1842 and is the oldest Orthodox Church in Dobrich.
    In the Crimean War (1853-1856) the church was burned down and was restored to its present form in 1864.

    gpicview is another cool picture viewing program, I like. Unfortunately on Slackware, there is no prebuild package and the only option is either to convert it with alien from deb package or to download source and compile as usual with ./configure && make && make install .
    Downloading and compiling from source went just fine on Slackware Linux 13.37gpicview has more modern looking interface, than gwenview and geeqie. and is great for people who want to be in pace with desktop fashion 🙂

How to exclude files on copy (cp) on GNU / Linux / Linux copy and exclude files and directories (cp -r) exclusion

Saturday, March 3rd, 2012

I've recently had to make a copy of one /usr/local/nginx directory under /usr/local/nginx-bak, in order to have a working copy of nginx, just in case if during my nginx update to new version from source mess ups.

I did not check the size of /usr/local/nginx , so just run the usual:

nginx:~# cp -rpf /usr/local/nginx /usr/local/nginx-bak
...

Execution took more than 20 seconds, so I check the size and figured out /usr/local/nginx/logs has grown to 120 gigabytes.

I didn't wanted to extra load the production server with copying thousands of gigabytes so I asked myself if this is possible with normal Linux copy (cp) command?. I checked cp manual e.g. man cp, but there is no argument like –exclude or something.

Even though the cp command exclude feature is not implemented by default there are a couple of ways to copy a directory with exclusion of subdirectories of files on G / Linux.

Here are the 3 major ones:

1. Copy directory recursively and exclude sub-directories or files with GNU tar

Maybe the quickest way to copy and exclude directories is through a littke 'hack' with GNU tarnginx:~# mkdir /usr/local/nginx-new;
nginx:~# cd /usr/local/nginx#
nginx:/usr/local/nginx# tar cvf - \. --exclude=/usr/local/nginx/logs/* \
| (cd /usr/local/nginx-new; tar -xvf - )

Copying that way however is slow, in my case it fits me perfectly but for copying large chunks of data it is better not to use pipe and instead use regular tar operation + mv

# cd /source_directory
# tar cvf test.tar --exclude=dir_to_exclude/*\--exclude=dir_to_exclude1/* . \
# mv test.tar /destination_directory
# cd /destination# tar xvf test.tar

2. Copy folder recursively excluding some directories with rsync

P>eople who has experience with rsync , already know how invaluable this tool is. Rsync can completely be used as for substitute=de.a# rsync -av –exclude='path1/to/exclude' –exclude='path2/to/exclude' source destination

This example, can also be used as a solution to my copy nginx and exclude logs directory casus like so:

nginx:~# rsync -av --exclude='/usr/local/nginx/logs/' /usr/local/nginx/ /usr/local/nginx-new

As you can see for yourself, this is a way more readable for the tar, however it will not work on servers, where rsync is not installed and it is unusable if you have to do operations as a regular users on such for that case surely the GNU tar hack is more 'portable' across systems.
rsync has also Windows version and therefore, the same methodology should be working on MS Windows and good for batch scripting.
I've not tested it myself, yet as I've never used rsync on Windows, if someone has tried and it works pls drop me a short msg in comments.
3. Copy directory and exclude sub directories and files with find

Find in collaboration with cp can also be used to exclude certain directories while copying. Actually this method is better than the GNU tar hack and surely more efficient. For machines, where rsync is not installed it is just a perfect way to copy files from location to location, while excluding some directories, here is an example use of find and cp, for the above nginx case:

nginx:~# cd /usr/local/nginx
nginx:~# mkdir /usr/local/nginx
nginx:/usr/local/nginx# find . -type d \( ! -name logs \) -print -exec cp -rpf '{}' /usr/local/nginx-bak \;

This will find all directories inside /usr/local/nginx with find command print them on the screen, then execute recursive copy over each found directory and copy to /usr/local/nginx-bak

This example will work fine in the nginx case because /usr/local/nginx does not contain any files but only sub-directories. In other occwhere the directory does contain some files besides sub-directories the files had to also be copied e.g.:

# for i in $(ls -l | egrep -v '^d'); do\
cp -rpf $i /destination/directory

This will copy the files from source directory (for instance /usr/local/nginx/my_file.txt, /usr/local/nginx/my_file1.txt etc.), which doesn't belong to a subdirectory.

The cmd expression:

# ls -l | egrep -v '^d'

Lists only the files while excluding all the directories and in a for loop each of the files is copied to /destination/directory

If someone has better ideas, please share with me 🙂

How to mount ISO image files in Graphical Environment (GUI) on Ubuntu and Debian GNU/Linux

Saturday, January 14th, 2012

Mounting ISO files in Linux is easy with mount cmd, however remembering the exact command one has to issue is a hard task because mounting ISO files is not a common task.

Mounting ISO files directly by clicking on the ISO file is very nice, especially for lazy people uninitiated with the command line 😉

Besides that I'm sure many Windows users are curious if there is an equivallent program to DaemonTools for Linux / BSD*?

The answer to this question is YES!
There are two major programs which can be used as a DaemonTools substitute on Linux:

These are FuriousISOMount and AcetoneISO
AcetoneISO is more known and I've used it some long time ago and if I'm correct it used to be one of the first ISO Mount GUI programs for Linux. There is a project called GMount-ISO / (GMountISO) which of the time of writting this article seems to be dead (at least I couldn't find the source code).

Luckily FuriousISOMount and AcetoneISO are pretty easy to install and either one of the two is nowdays existing in most Linux distributions.
Probably the programs can also be easily run on BSD platform also quite easily using bsd linux emulation.
If someone has tried something to mount GUIs in Free/Net/OpenBSD, I'll be interesting to hear how?

1. Mount ISO files GUI in GNOME with Furius ISO Mount

FuriousISOMount is a simple Gtk+ interface to mount -t iso9660 -o loop command.

To start using the program on Debian / Ubuntu install with apt;

debian:~# apt-get install furiusisomount
The following extra packages will be installed:
fuseiso fuseiso9660 libumlib0
The following NEW packages will be installed:
furiusisomount fuseiso fuseiso9660 libumlib0

To access the program in GNOME after install use;

Applications -> Accessories -> Furious ISO Mount

Screenshot ISO Mount Tool Debian GNU/Linux Screenshot
 

When mounting it is important to choose Loop option to mount the iso instead of Fuse

After the program is installed to associate the (.iso) ISO files, to permanently be opened with furiusisomount roll over the .iso file and choose Open With -> Other Application -> (Use a custom command) -> furiusisomount

GNOME Open with menu Debian GNU / Linux

2. Mount ISO Files in KDE Graphical Environment with AcetoneISO

AcetoneISO is build on top of KDE's QT library and isway more feature rich than furiousisomount.
Installing AcetoneISO Ubuntu and Debian is done with:

debian:~# apt-get install acetoneiso
The following NEW packages will be installed:
acetoneiso gnupg-agent gnupg2 libksba8 pinentry-gtk2 pinentry-qt4
0 upgraded, 6 newly installed, 0 to remove and 35 not upgraded.
Need to get 3,963 kB of archives.
After this operation, 8,974 kB of additional disk space will be used.
...

Screenshot Furius ISO Mount Tool Debian GNU/Linux ScreenShot

AcetoneISO supports:
 

  • conversion between different ISO formats
  • burn images to disc
  • split ISO image volumes
  • encrypt images
  • extract password protected files

Complete list of the rich functionality AcetoneISO offers is to be found on http://www.acetoneteam.org/viewpage.php?page_id=6
To start the program via the GNOME menus use;

Applications -> Accessories -> Sound & Video -> AcetoneISO

I personally don't like AcetoneISO as I'm not a KDE user and I see the functionality this program offers as to rich and mostly unnecessery for the simple purpose of mounting an ISO.

3. Mount ISO image files using the mount command

If you're a console guy and still prefer mounting ISO with the mount command instead of using fancy gui stuff use:

# mount -t iso9660 -o loop /home/binary/someiso.iso /home/username/Iso_Directory_Name

 

Solve ALSA audio and mic issues on Lenovo Thinkpads on Debian and Ubuntu Linux

Wednesday, January 11th, 2012

Since I've blogged about my recent skype issues. I've played a lot with pulseaudio, alsa, alsa-oss to experimented a lot until I figured out why Skype was failing to properly delivery sound and record via my embedded laptop mic.

Anyways, while researching on the cause of my Thinkpad r61 mic issues, I've red a bunch of blog posts by people experiencing microphone oddities with Lenovo Thinkpads

Throughout the search I come across one very good article, which explained that in many cases the Thinkpad sound problems are caused by the snd-hda-intel alsa kernel module. snd-hda-intel fails to automatically set proper sb model type argument during Linux install when the soundcard is initialized with some argument like options snd-hda-intel model=auto

Hence, the suggested fix which should resolve this on many Thinkpad notebooks is up to passing the right module argument:

To fix its neceessery to edit /etc/modprobe.d/alsa-base.conf .

debian:~# vim /etc/modprobe.d/alsa-base.conf

Find the line in the file starting with:
options snd-hda-intel model=

and substitute with:

options snd-hda-intel model=thinkpad

Finally a restart of Advaned Linux Sound Architecture (alsa) is required:

debian:~# /etc/init.d/alsa restart
...

At most cases just restarting the alsa via its init script is not enough, since the ssnd-hda-intel kernel module is already in use by some program or something, so its best to do a reboot to make sure the module is loaded with the new model=thinkpad argument.

My exact laptop sound card model is:

debian:~# lspci |grep -i audio
00:1b.0 Audio device: Intel Corporation 82801H (ICH8 Family) HD Audio Controller (rev 03)

After changing the module and using alsamixer and aumix to make sure mic is unmuted and its volume is high enough, mic sound rec works fine.

How to fix Pulseaudio and Skype crappy sound glitches, choppy sound and crackling on Debian GNU / Linux

Tuesday, January 10th, 2012

I've experienced plenty of problems with Pulseaudio and Skype output sound hell crappy. This stupid proprietary program Skype is a total crap … Anyways again thanks to ArchLinux's wiki, I've used the two mentioned steps to fix all this Skype in / out problems …

1. Fix problems with Glitches, voice skips and crackling In file /etc/pulse/default.pa its necessery to substitute the line;

load-module module-udev-detect

with

load-module module-udev-detect tsched=0

2. Resolve Choppy sound in (Pulseaudio) -> Skype

In /etc/pulse/daemon.conf two lines has to be also substituted:

; default-sample-rate = 44100

Should become;

default-sample-rate = 48000

3. Change /etc/default/pulseaudio to allow dynamic module loading

It is a good idea to the default settings from DISALLOW_MODULE_LOADING=1 to DISALLOW_MODULE_LOADING=0 .This step is not required and I'm not sure if it has some influence on solving sound in / out problems with Skype but I believe it can be helpful in some cases..

So in /etc/default/pulseaudio Substitute:

DISALLOW_MODULE_LOADING=1

to;

DISALLOW_MODULE_LOADING=0

4. Restart PulseAudio server

After the line is changed and substituted a restart of PulseAudio is required. For PulseAudio server restart a gnome session logout is necessery. Just LogOff logged Gnome user and issue cmd:

debian:~# pkill pulseaudio

This will kill any left pulseaudio server previous instances.