Posts Tagged ‘wget’

How to use wget and curl via HTTP Proxy server / How to set a HTTPS proxy server on a bash shell on Linux

Wednesday, January 27th, 2016

linux-ssl-proxy-configuration-from-command-line-with-wget-and-curl-howto

I've been working a bit on a client's automation, the task is to automate process of installations of Apaches / Tomcats / JBoss and Java servers, so me and colleagues don't waste too
much time in trivial things. To complete that I've created a small repository on a Apache with a WebDav server with major versions of each general branch of Application servers and Javas.
In order to access the remote URL where the .tar.gz binaries archives reside, I had to use a proxy serve as the client runs all his network in a DMZ and all Web Port 80 and 443 HTTPS traffic inside the client network
has to pass by the network proxy.

Thus to make the downloads possible via the shell script, writting I needed to set the script to use the HTTPS proxy server. I've been using proxy earlier and I was pretty aware of the http_proxy bash shell
variable thus I tried to use this one for the Secured HTTPS proxy, however the connection was failing and thanks to colleague Anatoliy I realized the whole problem is I'm trying to use http_proxy shell variable
which has to only be used for unencrypted Proxy servers and in this case the proxy server is over SSL encrypted HTTPS protocol so instead the right variable to use is:
 

https_proxy


The https_proxy var syntax, goes like this:

proxy_url='http-proxy-url.net:8080';
export https_proxy="$proxy_url"

how-to-set-https_proxy_url-on-linux-freebsd-openbsd-bsd-and-unix-from-terminal-console

Once the https_proxy variable is set  UNIX's wget non interactive download tool starts using the proxy_url variable set proxy and the downloads in my script works.

Hence to make the different version application archives download work out, I've used wget like so:
 

 wget –no-check-certificate –timeout=5 https://full-path-to-url.net/file.rar


For other BSD / HP-UX / SunOS UNIX Servers where  shells are different from Bourne Again (Bash) Shell, the http_proxy and  https_proxy variable might not be working.
In such cases if you have curl (command line tool) is available instead of wget to script downloads you can use something like:
 

 curl -O -1 -k –proxy http-proxy-url.net:8080 https://full-path-to-url.net/file.rar

The http_proxy and https_proxy variables works perfect also on Mac OS X, default bash shell, so Mac users enjoy.
For some bash users in some kind of firewall hardened environments like in my case, its handy to permanently set a proxy to all shell activities via auto login Linux / *unix scripts .bashrc or .bash_profile that saves the inconvenience to always
set the proxy so lynx and links, elinks text console browsers does work also anytime you login to shell.

Well that's it, my script enjoys proxying traffic 🙂
 

Share this on

MobaXTerm: A good gnome-terminal like tabbed SSH client for Windows / Windows Putty Tabs Alternative

Wednesday, November 13th, 2013

Mobaxterm ssh client putty MS Windows alternative with tabs suitable for ex linux users

mobaxterm with tabbed ssh connections screenshot best putty windows ssh client alternative now

Last 10+ years I worked on GNU / Linux as Desktop. Last 7 years most of my SSH connections were managed from GNOME and I'm quite used to gnome-terminal ssh tabbing. In my new Employee Hewlett Packard. I'm forced to work on Microsoft Windows 7 and thus I used for a month or so Putty and Kitty fork from version 0.63 of PuTTY advertising itself as the best telnet / SSH client in the world. Both of the two lack tabbing and have interface which is pretty unfamiliar to me. As I'm so used to using native UNIX terminal. Fortunately a colleague of mine Ivelin was using an SSH client called MobaXTerm which very much did emulation similar to my favourite gnome-terminal. MobaXterm is not free software / open source app but this doesn't matter so much to me as anyways I'm running a non-free Win OS on my desktop. What makes MobaXterm so attractive is its rich functionality (cosmic years infront of Putty).

Here is website description of MobaXterm quoted from its website:

MobaXterm is an enhanced terminal for Windows with an X11 server, a tabbed SSH client and several other network tools for remote computing (VNC, RDP, telnet, rlogin). MobaXterm brings all the essential Unix commands to Windows desktop, in a single portable exe file which works out of the box.

Overall list of features MobaXterm offers are;

  •     multitab terminal with embedded Unix commands (ls, cd, cat, sed, grep, awk, rsync, wget, …)

  •     embedded X11 server for easily exporting your Unix/Linux display

  •     a session manager with several network utilities: SSH, RDP, VNC, Telnet, Rlogin, FTP, SFTP and XDMCP

  •     passwords management for SSH, RDP, VNC, SFTP (on demand password saving)

  •     easy graphical file transfer using drag and drop during SSH sessions

  •     advanced SSH tunnels creation tool (graphical port forwarding builder)

  •     tasks automation using scripts or macros

Mobaxterm is portable just like Putty so its useful to use on HOP stations to servers like used in big companies like HP. Featured embedded Unix commands (e.g., ls, cd, cat, sed, grep, awk, rsync, wget) gives a feeling like you're working on pure Linux console making people addicted to Linux / BSD quite confortable. Some other very useful terminal emulator functions are support for anti-aliasing session manager (save / remember passwords for ssh sessions in Crypted format so much missing in Putty) and it even supports basic macros.
Basic UNIX commands embedded in MobaXterm are taken and ported from Cygwin projectLinux-like environment for Windows making it possible to port software running on POSIX systems (such as Linux, BSD, and Unix systems) to Windows. A very cool think is also MobaXterm gives you a Linux like feel of console navigation in between basic files installed from Cygwin. Some downside I found is program menus which look at first glimpse a bit confusing especially for people used to simplicity of gnome-terminal. Once logged in to remote host via ssh command the program offers you to log you in also via SFTP protocol listing in parallel small window with possibility to navigate / copy / move etc. between server files in SFTP session which at times is pretty useful as it saves you time to use some external SFTP connector tools like  WinSCP.

From Tools configuration menu, there are few precious tools as well;
         – embedded text editor MobaTextEditor
         – MobaFoldersDiff (Able to show diffeernces between directories)
         – AsciiTable (Complete List of Ascii table with respective codes and characters)
         – Embedded simple Calculator
         – List open network ports – GUI Tool to list all open ports on Windows localhost
         – Network packets capture – A Gui tool showing basic info like from UNIX's tcpdump!
         – Ability to start quickly on local machine (TFTP, FTP, SFTP / SSH server, Telnet server, NFS server, VNC Server and even simple implementation of HTTP server)

Mobaxterm list of tools various stuff

         Mobaxterm run various services quickly on Windows servers management screenshot

Below are few screenshots to get you also idea about what kind of configuration MobaXterm supports
  mobaxterm terminal configuration settings screenshot

mobaxterm better putty alternative x11 configuration tab screenshot

mobaxterm windows ssh client for linux users configuration ssh tab screenshot

mobaxterm-putty-alternative-for-windows-configuration-display-screenshot
MobaXTerm Microsoft Windows ssh client configuration misc menu screenshot
To configure and use Telnet, RSH, RDP, VNC, FTP etc. Sessions use the Sessions tab on top menu.

One very handy thing is MobaXterm supports export of remote UNIX display with no requirement to install special Xserver like already a bit obsolete Xming – X server for Windows.
The X Display Manager Control Protocol (XCMCP) is a key feature of the X11 architecture. Together with XDMCP, the X network protocol allows distributed operation of the X server and X display manager. The requesting X server runs on the client (usually as an X terminal), thus providing a login service, that why the X server ported to MobaXterm from Cygwin also supports XDMCP. If, for example, you want to start a VNC session with a remote VNC server, all you have to do is enter the remote VNC server’s IP address in the VNC area; the default VNC port is already registered.

Accessing the remote Windows server via RDP (Remote Desktop Protocol) is also a piece of cake. Once you establish a session to RDP or other Proto it is possible to save this session so later you just choose between session to access. The infamous (X11 Port Forwarding) or creation of SSH encrypted tunnels between hosts to transfer data securily or hide your hostname is also there.

MobaXterm is undoubtedly a very useful and versatile tool. Functionally, the software is well mannered, and Windows users who want to sniff a little Linux/Unix air can get a good idea of how Linux works. A closer look reveals that anything you can do with MobaXterm can be achieved directly with freely available tools (Cygwin) and Unix tools ported from Cygwin. However, although Cygwin provides a non-Posix environment for Windows, it doesn’t offer a decent terminal, which is one thing Moba-Xterm has going for it.

Admittedly, in pure vanilla Cygwin, you can start an X server automatically and then use xterm, but xterm lacks good-quality fonts, whereas MobaXterm conveniently lets you integrate a font server.

Share this on

Adding Teamviewer to auto start on Linux GNOME login

Friday, February 1st, 2013

Administrating Linux via graphical interface is not common, however sometimes it is necessery. There are plenty of ways to remotely administrate with GUI Linux. You can connect to remote Xserver and launch X session via xinit, connect via (Gnome Display Manager) GDM, use nomachine NX server / client (if you're on slow connection line) or use the good old Teamviewer.

As Teamviewer works pretty well on both Windows and Linux in last times I like using teamviewer as a standard. It is freeware and it often disconnects with the annoying Trial message, but in general for managing something quick on remote desktop it is nice.

To use teamviewer, you need to have it installed on the Linux host via deb or rpm:

Whether on Debian / Ubuntu use:

 

# wget http://www.teamviewer.com/download/version_8x/teamviewer_linux_x64.deb
.....
# dpkg -i teamviewer_linux_x64.deb
...
....

 

On Fedora, CentOS, RHEL run:

# wget -q http://www.teamviewer.com/download/teamviewer_linux.rpm
# rpm -Uvh teamviewer_linux.rpm
.....

Once package is installed teamviewer is installed in /opt/teamviewer/* there is a tiny wrapper run script in /usr/bin/teamviewer – evoking TeamViewer to be run via wine emulation.

Hence to make TeamViewer start on certain user GNOME login the script has to run on GNOME user login session.
In both GNOME 2 and GNOME 3 what is run on user login is managed through gnome-session-properties thus /usr/bin/teamviewer has to be set to run through gnome-session-properties (to run it press ALT+F2 or type it directly in gnome-terminal)

user@linux:~$ gnome-session-properties

A window like in below screenshot pops up and from there Add TeamViewer.

Adding Teamviewer to auto start on Linux Debian Fedora CentOS GNOME non privileged user

To be able to later connect via a Remote host with another TeamViewer peer launch TeamViewer and configure permanent password through menus:

Extras -> Options -> Security

teamviewer extras options security configuring teamviewier permanent password for ID

All left is to write down your Teamviewer Remote Connect ID and permanent set password

Teamviewer remote connect ID screnshot Linux

After next succesful GNOME login teamviewer will just pop-up. Enjoy

Share this on

New version of Teamviewer 8 beta for GNU / Linux is out

Monday, January 28th, 2013

I had to login to a Windows host, after a friend handed in his Teamviewer ID and pass. I tried connecting few times and connection failed every time. It was not very likely it is a network problem as I have pretty good internet bandwidth here. I ask him what is his teamviewer, release and it turned out he is using TeamViewer 8. Just few months ago I've updated my teamviewer from TV 6 to TV 7 after finally Teamviewer 7 was launched for Linux. I check immediately TeamViewer website, but this time (TeamViewer GmbH – the vendor company of TV) has launched TV 8 for Linux in parallel with launching TV 8  for Windows.

Of course like often in Linux it is not all perfect TeamViewer 8 version is still in beta testing stage. I've downloaded and updated my (64 bit) teamviewer 7 to teamviewer 8 using the Deb binary, i.e.:

# wget http://www.teamviewer.com/download/version_8x/teamviewer_linux_x64.deb
.....
# dpkg -i teamviewer_linux_x64.deb
...
....

Just like previous TeamViewer, releases, the program launches via wine, so teamviewer_linux_x64.deb depends on working wine Windows emulator.
There are minor difference in graphic interface but no so significant changes from previous release.

TeamViewer beta 8 ver on Debian GNU linux screenshot

I used it to connect to remote Windows 8 host and experienced random issues with typing some characters from keyboard remotely, though I'm not sure if this was due to some bug of the TV 8 beta release or it is due to fact the Win host was infested with viruses.
Just for some convenience if I further need it, I've also made mirror of 32 bit TeamViewer 8 beta here and 64 bit TeamViewer 8 beta mirror here.

 

Share this on

How to clean QMAIL mail server filled Queue with spam mail messages

Friday, June 8th, 2012

Sys Admins managing QMAIL mail servers know, often it happens QMAIL queue gets filled with unwanted unsolicated SPAM e-mails due to a buggy WEB PHP / Perl mail sendingform or some other odd reason, like too many bouncing messages ,,,,

For one more time I’ve experienced the huge SPAM destined mails queue on one QMAIL running on Debian GNU / Linux.

= … Hence I needed to clean up the qmail queue. For this there is a little tool written in PERL called qmhandle

A+) Download qmhandle;

Qmhandle’s official download site is in SourceForge

I’ve made also a mirror copy of QMHandle here

qmail-server:~# cd /usr/local/src
qmail-server:/usr/local/src# wget -q http://www.pc-freak.net/files/qmhandle-1.3.2.tar.gz

B=) STOP QMAIL
As it is written in the program documentation one has to be very careful when cleaning the mail queue. Be sure to stop qmail with qmailctl or whatever script is used to shutdown any mail sever in progress operations, otherwise there is big chance the queue to mess up badly .

C#) Check extended info about the mail queue:

qmail-server:/usr/local/src/qmhandle-1.3.2# ./qmhandle -l -c
102 (10, 10/102) Return-path: anonymous@qmail-hostname.com
From: QMAIL-HOSTNAME
To: as1riscl1.spun@gmail.com
Subject: =?UTF8?B?QWNjb3VudCBpbmZvcm1hdGlvbiBmb3IgU09DQ0VSRkFNRQ==?=
Date: 1 Sep 2011 21:02:16 -0000
Size: 581 bytes
,,,,
1136 (9, 9/1136)
Return-path: werwer@qmail-hostname.com.tw
From: martin.georgiev@qmail-hostname.com
To: costador4312@ukr.net
Subject: Link Exchange Proposal / Qmail-Hostname.com
Date: Fri, 2 Sep 2011 07:58:52 +0100 (BST)
Size: 1764 bytes
,,,,....
1103 (22, 22/1103)
Return-path: anonymous@qmail-hostname.com
From: SOCCERFAME
To: alex.masdf.e.kler.1@gmail.com
Subject: =?UTF8?B?QWNjb3VudCBpbmZvcm1hdGlvbiBmb3IgU09DQ0VSRkFNRQ==?=
Date: 2 Sep 2011 00:36:11 -0000
Size: 578 bytes
,,,,,,
,,,,....
Total messages: 1500 Messages with local recipients: 0
Messages with remote recipients: 1500
Messages with bounces: 500
Messages in preprocess: 300

D-) Delete the Queue
qmail-server:/usr/local/src/qmhandle-1.3.2# ./qmHandle -D
......
......

Fina1ly launch the qmail to continue normal oper.

qmail-server:~# qmailctl start
,,,,,
..,,,

Share this on

How to copy / clone installed packages from one Debian server to another

Friday, April 13th, 2012

1. Dump all installed server packages from Debian Linux server1

First it is necessery to dump a list of all installed packages on the server from which the intalled deb packages 'selection' will be replicated.

debian-server1:~# dpkg --get-selections \* > packages.txt

The format of the produced packages.txt file will have only two columns, in column1 there will be the package (name) installed and in column 2, the status of the package e.g.: install or deinstall

Note that you can only use the –get-selections as root superuser, trying to run it with non-privileged user I got:

hipo@server1:~$ dpkg --set-selections > packages.txt
dpkg: operation requires read/write access to dpkg status area

2. Copy packages.txt file containing the installed deb packages from server1 to server2

There is many way to copy the packages.txt package description file, one can use ftp, sftp, scp, rsync … lftp or even copy it via wget if placed in some Apache directory on server1.

A quick and convenient way to copy the file from Debian server1 to server2 is with scp as it can also be used easily for an automated script to do the packages.txt file copying (if for instance you have to implement package cloning on multiple Debian Linux servers).

root@debian-server1:~# scp ./packages.txt hipo@server-hostname2:~/packages.txt
The authenticity of host '83.170.97.153 (83.170.97.153)' can't be established. RSA key fingerprint is 38:da:2a:79:ad:38:5b:64:9e:8b:b4:81:09:cd:94:d4. Are you sure you want to continue connecting (yes/no)? yes Warning: Permanently added '83.170.97.153' (RSA) to the list of known hosts. hipo@83.170.97.153's password:
packages.txt

As this is the first time I make connection to server2 from server1, I'm prompted to accept the host RSA unique fingerprint.

3. Install the copied selection from server1 on server2 with apt-get or dselect

debian-server2:/home/hipo# apt-get update
...
debian-server2:/home/hipo# apt-get upgrade
Reading package lists... Done
Building dependency tree
Reading state information... Done
debian-server2:/home/hipo# dpkg --set-selections < packages.txt
debian-server2:/home/hipo# apt-get -u dselect-upgrade --yes

The first apt-get update command assures the server will have the latest version of the packages currently installed, this will save you from running an outdated versions of the installed packages on debian-server2

Bear in mind that using apt-get sometimes, might create dependency issues. This is depending on the exact package names, being replicated in between the servers

Therefore it is better to use another approach with bash for loop to "replicate" installed packages between two servers, like so:

debian-server2:/home/hipo# for i in $(cat packages.txt |awk '{ print $1 }'); do aptitude install $i; done

If you want to automate the questioning about aptitude operations pass on the -y

debian-server2:/home/hipo# for i in $(cat packages.txt |awk '{ print $1 }'); do aptitude -y install $i; done

Be cautious if the -y is passed as sometimes some packages might be removed from the server to resolve dependency issues, if you need this packages you will have to again install them manually.

4. Mirroring package selection from server1 to server2 using one liner

A quick one liner, that does replicate a set of preselected packages from server1 to server2 is also possible with either a combination of apt, ssh, awk and dpkg or with ssh + dpkg + dselect :

a) One-liner code with apt-get unifying the installed packages between 2 or more servers

debian-server2:~# apt-get --yes install `ssh root@debian-server1 "dpkg -l | grep -E ^ii" | awk '{print $2}'`
...

If it is necessery to install on more than just debian-server2, copy paste the above code to all servers you want to have identical installed packages as with debian-server1 or use a shor for loop to run the commands for each and every host of multiple servers group.

In some cases it might be better to use dselect instead as in some situations using apt-get might not correctly solve the package dependencies, if encountering problems with dependencies better run:

debian-server2:/home/hipo# ssh root@debian-server1 'dpkg --get-selections' | dpkg --set-selections && dselect install

As you can see using this second dselect installed "package" mirroring is also way easier to read and understand than the prior "cryptic" method with apt-get, hence I personally think using dselect method is a better.

Well that's basically it. If you need to synchronize also configurations, either an rsync/scp shell script, should be used with all defined server1 config files or in case if a cloning of packages between identical server machines is necessery dd or some other tool like Norton Ghost could be used.
Hope this helps, someone.

Share this on

How to run your Own / Personal Domain Web WHOIS service in a minute with SpeedyWHOIS

Thursday, April 5th, 2012

Running your own personal WHOIS service speedy whois in browser screenshot

I've been planning to run my own domain WHOIS service, for quite sime time and I always postpone or forgot to do it.
If you wonder, why would I need a (personal) web whois service, well it is way easier to use and remember for future use reference if you run it on your own URL, than wasting time in search for a whois service in google and then using some other's service to get just a simple DOMAIN WHOIS info.

So back to my post topic, I postpopned and postponed to run my own web whois, just until  yesterday, whether I have remembered about my idea to have my own whois up and running and proceeded wtih it.

To achieve my goal I checked if there is free software or (open source) software that easily does this.
I know I can write one for me from scratch, but since it would have cost me some at least a week of programming and testing and I didn't wanted to go this way.

To check if someone had already made an easy to install web whois service, I looked through in the "ultimate source for free software" sourceforge.net

Looking for the "whois web service" keywords, displayed few projects on top. But unfortunately many of the projects sources was not available anymore from http://sf.net and the project developers pages..
Thanksfully in a while, I found a project called SpeedyWhois, which PHP source was available for download.

With all prior said about project missing sources, Just in case if SpeedyWhois source  disappears in the future (like it probably) happened with, some of the other WHOIS web service projects, I've made SpeedyWhois  mirror for download here

 
Contrary to my idea that installing the web whois service might be a "pain in the ass", (like is the case  with so many free software php scripts and apps) – the installation went quite smoothly.
 
To install it I took the following 4 steps:
 
1. Download the source (zip archive) with wget 
 
# cd /var/www/whois-service;
/var/www/whois-service# wget -q http://www.pc-freak.net/files/speedywhois-0.1.4.zip
 
2. Unarchive it with unzip command 
 
 
/var/www/whois-service# unzip speedywhois-0.1.4.zip
3. Set the proper DNS records

My NS are using Godaddy, so I set my desired subdomain record from their domain name manager.
 

4. Edit Apache httpd.conf to create VirtualHost
 
This step is not mandatory, but I thought it is nice if I put the whois service under a subdomain, so add a VirtualHost to my httpd.conf
 
The Virtualhost Apache directives, I used are:
 
<VirtualHost *:80>
        ServerAdmin hipo_aT_pc-freak.net
        DocumentRoot /var/www/whois-service
        ServerName whois.pc-freak.net
        &lt;Directory /var/www/whois-service
        AllowOverride All
        Order Allow,Deny
        Allow from All
        </Directory>
</VirtualHost>
 
Onwards to take effect of new Webserver configs, I did Apache restart
 
# /usr/local/etc/rc.d/apache2 restart
 
Further on You can test whois a domain using my new installed SpeedyWHOISWeb WHOIS service  on http://whois.pc-freak.net
Whenever I have some free time, maybe I will work on the code, to try to add support for logging of previous whois requests and posting links pointing to the previous whois done via the web WHOIS service on the main whois page.
 
One thing that I disliked about how SpeedyWHOIS is written is, if there is no WHOIS information returned for a domain request (e.g.) a:
 
# whois domainname.com
 
returns an empty information, the script doesn't warn with a message there is no WHOIS data available for this domain or something.
 
 
This is not so important as this kind of behaviour of 'error' handling can easily be changed with minimum changes in the php code.
If you wonder, why do I need the web whois service, the answer is it is way easier to use.
I don't have more time to research a bit further on the alternative open source web whois services, so I would be glad to hear from anyone who tested other web whois service that is free comes under a FOSS license.
In the mean time, I'm sure people with a small internet websites like mine who are looking to run their OWN (personal) whois service SpeedyWHOIS does a great job.

Share this on

How to make a mirror of website on GNU / Linux with wget / Few tips on wget site mirroring

Wednesday, February 22nd, 2012

how-to-make-mirror-of-website-on-linux-wget

Everyone who used Linux is probably familiar with wget or has used this handy download console tools at least thousand of times. Not so many Desktop GNU / Linux users like Ubuntu and Fedora Linux users had tried using wget to do something more than single files download.
Actually wget is not so popular as it used to be in earlier linux days. I've noticed the tendency for newer Linux users to prefer using curl (I don't know why).

With all said I'm sure there is plenty of Linux users curious on how a website mirror can be made through wget.
This article will briefly suggest few ways to do website mirroring on linux / bsd as wget is both available on those two free operating systems.

1. Most Simple exact mirror copy of website

The most basic use of wget's mirror capabilities is by using wget's -mirror argument:

# wget -m http://website-to-mirror.com/sub-directory/

Creating a mirror like this is not a very good practice, as the links of the mirrored pages will still link to external URLs. In other words link URL will not pointing to your local copy and therefore if you're not connected to the internet and try to browse random links of the webpage you will end up with many links which are not opening because you don't have internet connection.

2. Mirroring with rewritting links to point to localhost and in between download page delay

Making mirror with wget can put an heavy load on the remote server as it fetches the files as quick as the bandwidth allows it. On heavy servers rapid downloads with wget can significantly reduce the download server responce time. Even on a some high-loaded servers it can cause the server to hang completely.
Hence mirroring pages with wget without explicity setting delay in between each page download, could be considered by remote server as a kind of DoS – (denial of service) attack. Even some site administrators have already set firewall rules or web server modules configured like Apache mod_security which filter requests to IPs which are doing too frequent HTTP GET /POST requests to the web server.
To make wget delay with a 10 seconds download between mirrored pages use:

# wget -mk -w 10 -np --random-wait http://website-to-mirror.com/sub-directory/

The -mk stands for -m/-mirror and -k / shortcut argument for –convert-links (make links point locally), –random-wait tells wget to make random waits between o and 10 seconds between each page download request.

3. Mirror / retrieve website sub directory ignoring robots.txt "mirror restrictions"

Some websites has a robots.txt which restricts content download with clients like wget, curl or even prohibits, crawlers to download their website pages completely.

/robots.txt restrictions are not a problem as wget has an option to disable robots.txt checking when downloading.
Getting around the robots.txt restrictions with wget is possible through -e robots=off option.
For instance if you want to make a local mirror copy of the whole sub-directory with all links and do it with a delay of 10 seconds between each consequential page request without reading at all the robots.txt allow/forbid rules:

# wget -mk -w 10 -np -e robots=off --random-wait http://website-to-mirror.com/sub-directory/

4. Mirror website which is prohibiting Download managers like flashget, getright, go!zilla etc.

Sometimes when try to use wget to make a mirror copy of an entire site domain subdirectory or the root site domain, you get an error similar to:

Sorry, but the download manager you are using to view this site is not supported.
We do not support use of such download managers as flashget, go!zilla, or getright

This message is produced by the site dynamic generation language PHP / ASP / JSP etc. used, as the website code is written to check on the browser UserAgent sent.
wget's default sent UserAgent to the remote webserver is:
Wget/1.11.4

As this is not a common desktop browser useragent many webmasters configure their websites to only accept well known established desktop browser useragents sent by client browsers.
Here are few typical user agents which identify a desktop browser:
 

  • Mozilla/5.0 (Windows NT 6.1; rv:6.0) Gecko/20110814 Firefox/6.0
  • Mozilla/5.0 (X11; Linux i686; rv:6.0) Gecko/20100101 Firefox/6.0
  • Mozilla/6.0 (Macintosh; I; Intel Mac OS X 11_7_9; de-LI; rv:1.9b4) Gecko/2012010317 Firefox/10.0a4
  • Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:2.2a1pre) Gecko/20110324 Firefox/4.2a1pre

etc. etc.

If you're trying to mirror a website which has implied some kind of useragent restriction based on some "valid" useragent, wget has the -U option enabling you to fake the useragent.

If you get the Sorry but the download manager you are using to view this site is not supported , fake / change wget's UserAgent with cmd:

# wget -mk -w 10 -np -e robots=off \
--random-wait
--referer="http://www.google.com" \--user-agent="Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.6) Gecko/20070725 Firefox/2.0.0.6" \--header="Accept:text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5" \--header="Accept-Language: en-us,en;q=0.5" \--header="Accept-Encoding: gzip,deflate" \--header="Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7" \--header="Keep-Alive: 300"

For the sake of some wget anonimity – to make wget permanently hide its user agent and pretend like a Mozilla Firefox running on MS Windows XP use .wgetrc like this in home directory.

5. Make a complete mirror of a website under a domain name

To retrieve complete working copy of a site with wget a good way is like so:

# wget -rkpNl5 -w 10 --random-wait www.website-to-mirror.com

Where the arguments meaning is:
-r – Retrieve recursively
-k – Convert the links in documents to make them suitable for local viewing
-p – Download everything (inline images, sounds and referenced stylesheets etc.)
-N – Turn on time-stamping
-l5 – Specify recursion maximum depth level of 5

6. Make a dynamic pages static site mirror, by converting CGI, ASP, PHP etc. to HTML for offline browsing

It is often websites pages are ending in a .php / .asp / .cgi … extensions. An example of what I mean is for instance the URL http://php.net/manual/en/tutorial.php. You see the url page is tutorial.php once mirrored with wget the local copy will also end up in .php and therefore will not be suitable for local browsing as .php extension is not understood how to interpret by the local browser.
Therefore to copy website with a non-html extension and make it offline browsable in HTML there is the –html-extension option e.g.:

# wget -mk -w 10 -np -e robots=off \
--random-wait \
--convert-links http://www.website-to-mirror.com

A good practice in mirror making is to set a download limit rate. Setting such rate is both good for UP and DOWN side (the local host where downloading and remote server). download-limit is also useful when mirroring websites consisting of many enormous files (documental movies, some music etc.).
To set a download limit to add –limit-rate= option. Passing by to wget –limit-rate=200K would limit download speed to 200KB.

Other useful thing to assure wget has made an accurate mirror is wget logging. To use it pass -o ./my_mirror.log to wget.
 

Share this on

How to show country flag, web browser type and Operating System in WordPress Comments

Wednesday, February 15th, 2012

!!! IMPORTANT UPDATE COMMENT INFO DETECTOR IS NO LONGER SUPPORTED (IS OBSOLETE) AND THE COUNTRY FLAGS AND OPERATING SYSTEM WILL BE NOT SHOWING INSTEAD,

!!!! TO MAKE THE COUNTRY FLAGS AND OS WP FUNCTIONALITY WORK AGAIN YOU WILL NEED TO INSTALL WP-USERAGENT !!!

I've come across a nice WordPress plugin that displays country flag, operating system and web browser used in each of posted comments blog comments.
Its really nice plugin, since it adds some transperancy and colorfulness to each of blog comments 😉
here is a screenshot of my blog with Comments Info Detector "in action":

Example of Comments Info Detector in Action on wordpress blog comments

Comments Info Detector as of time of writting is at stable ver 1.0.5.
The plugin installation and configuration is very easy as with most other WP plugins. To install the plugin;

1. Download and unzip Comments Info Detector

linux:/var/www/blog:# cd wp-content/plugins
linux:/var/www/blog/wp-content/plugins:# wget http://downloads.wordpress.org/plugin/comment-info-detector.zip
...
linux:/var/www/blog/wp-content/plugins:# unzip comment-info-detector.zip
...

Just for the sake of preservation of history, I've made a mirror of comments-info-detector 1.0.5 wp plugin for download here
2. Activate Comment-Info-Detector

To enable the plugin Navigate to;
Plugins -> Inactive -> Comment Info Detector (Activate)

After having enabled the plugin as a last 3rd step it has to be configured.

3. Configure comment-info-detector wp plugin

By default the plugin is disabled. To change it to enabled (configure it) by navigating to:

Settings -> Comments Info Detector

Next a a page will appear with variout fields and web forms, where stuff can be changed. Here almost all of it should be left as it is the only change should be in the drop down menus near the end of the page:

Display Country Flags Automatically (Change No to Yes)
Display Web Browsers and OS Automatically (Change No to Yes

Comments Info Detector WordPress plugin configuration Screenshot

After the two menus are set to "Yes" and pressing on Save Changes the plugin is enabled it will immediately start showing information inside each comment the GeoIP country location flag of the person who commented as well as OS type and Web Browser 🙂

Share this on