Posts Tagged ‘linux hosts’

Linux: Convert recursively files content from WINDOWS-CP1251 to Unicode UTF-8 with recode and iconv

Wednesday, January 9th, 2013

 

Linux How to make mass file convert of charset windows CP1251 toutf8 and to other encodings

Some time ago I've written a tiny article, explaining how converting of HTML or TEXT file content inside file can be converted with iconv.

Just recently, I've made mirror of a whole website with its directory structure with wget cmd. The website to be mirrored was encoded with charset Windows-1251 (which is now a bit obsolete and not very recommended to use), where my Apache Webserver to which I mirrored is configured by default to deliver file content (.html, txt, js, css …) in newer and more standard (universal cyrillic) compliant UTF-8 encoding. Thus opening in browser from my website, the website was delivered in UTF-8, whether the file content itself was with encoding Windows CP-1251; Thus I ended up seeing a lot of monkey unreadable characters instead of Slavonic letters. To deal with the inconvenience, I've used one liner script that converts all Windows-1251 charset files to UTF-8. This triggered me writting this little post, hoping the info might be useful to others in a similar situation to mine:

1. Make Mass file charset / encoding convertion with recode

On most Linux hosts, recode is probably not installed. If you're on Debian / Ubuntu Linux install it with apt;

apt-get install --yes recode

It is also installable from default repositories on Fedora, RHEL, CentOS with:

 

yum -y install recode

Here is recode description taken from man page:

NAME
       recode – converts files between character sets

find . -name "*.html" -exec recode WINDOWS-1251..UTF-8 {} \;

If you have few file extensions whose chracter encoding needs to be converted lets say .html, .htm and .php use cmd:

find . -name "*.html" -o -name '*.htm' -o -name '*.php' -exec recode WINDOWS-1251..UTF-8 {} \;

Btw I just recently learned how one can look for few, file extensions with find under one liner the argument to pass is -o -name '*.file-extension', as you can see from  example, you can look for as  many different file extensions as you like with one find search command.

After completing the convertion, I've remembered that earlier I've also used iconv on a couple of occasions to convert from Cyrillic CP-1251 to Cyrillic UTF-8, thus for those who prefer to complete convertion with iconv here is an alternative a bit longer method using for cycle + mv and iconv.

2. Mass file convertion with iconv

for i in $(find . -name "*.html" -print); do
iconv -f WINDOWS-1251 -t UTF-8 $i > $i.utf-8;
mv $i $i.bak;
mv $i.utf-8 $i;
done

As you see in above line of code, there are two occurances of move command as one is backupping all .html files and second mv overwrites with files with converted encoding. For any other files different from .html, just change in cmd find . -iname '*.html' to whatever file extension.

How to fix “Fatal error: Call to undefined function: curl_init()” on FreeBSD and Debian

Saturday, June 18th, 2011

After installing the Tweet Old Post wordpress plugin and giving it, I’ve been returned an error of my PHP code interpreter:

Call to undefined function: curl_init()

As I’ve consulted with uncle Google’s indexed forums 😉 discussing the issues, I’ve found out the whole issues are caused by a missing php curl module

My current PHP installation is installed from the port tree on FreeBSD 7.2. Thus in order to include support for php curl it was necessery to install the port /usr/ports/ftp/php5-curl :

freebsd# cd /usr/ports/ftp/php5-curl
freebsd# make install clean

(note that I’m using the php5 port and it’s surrounding modules).

Fixing the Call to undefined function: curl_init() on Linux hosts I suppose should follow the same logic, e.g. one will have to install php5-curl to resolve the issue.
Fixing the missing curl_init() function support on Debian for example will be as easy as using apt to install the php5-curl package, like so:

debian:~# apt-get install php5-curl
...

Now my tweet-old-post curl requirement is matched and the error is gone, hooray 😉

How to Split files on Linux FreeBSD, NetBSD and OpenBSD

Sunday, July 31st, 2011

Split large files in pieces Scissors

Did you have the need to sometimes split an SQL extra large files to few pieces in order to be able to later upload it via phpmyadmin?
Did you needed an extra large video or data file to be cut in few pieces in order to transfer it in few pieces over an USB stick?
Or just to give you an another scenario where I sometimes need to have an enormous file let’s say 3G split in few pieces, in order to later read it in vim or mcedit .
I sometimes need to achieve this on FreeBSD and Linux hosts thus I thought it will be helpful to somebody to give a very quick tutorial on the way large files can be cut in pieces on Linux and BSD hosts.

GNU/Linux and FreeBSD are equipped with the split command. The purpose of this command is exactly the cutting of a file to a number of pieces.

On Linux the split command comes by default install to the system with the coreutils package on most Debian (deb) based and Redhat based (rpm) distributions, theerefore Linux’s version of split is GNU/split since it’s part of the GNU Coreutils package. An interesting fact about Linux split is that one of the two programmers who has coded it is Richard Stallman 😉

On BSD Unix split is the AT&T UNIX (BSD) split

In the past splitting files in pieces was much more needed than today, as people used floppy drives to transfer data, though today with the bloom of Internet and the improve of the data carriers transferring even an extra large files from one place to another is a way more trivial task still at many occasions splitting it in pieces is needed.

Even though today splitting file is very rarely required, still there are times when being able to split a file in X number of parts is very much needed.
Maybe the most common use of splitting a file today is necessery when a large SQL file dumps, like let’s say 200 MBytes of info database needs to be moved from ane hosting provider to another one.
Many hosting providers does disallow direct access with standard mySQL client programs to the database directly and only allow a user to connect only via phpMyAdmin or some other web interface like Cpanel to improve data into the SQL or PostgreSQL server.

In such times, having knowledge on the Unix split command is a priceless asset.

Even though on Linux and BSD the code for the split command is not identical and GNU/split and BSD/split has some basic differences, the use of split on both of these Unices is identical.
The way to split a file in few pieces using on both Linux and BSD OSes is being done with one and the same command, here is how:

1. Splitting file in size of 40 mb On Linux

linux:~# split -b 40m SQL-Backup-Data.sql SQL-Backup-Data_split

2. Splitting file in size of 40mb on BSD (FreeBSD, OpenBSD, NetBSD)

freebsd# split -b 40m SQL-Backup-Data.sql SQL-Backup-Data_split

The Second argument the split command takes is actually called a prefix, the prefix is used as a basis name for the creation of the newly generated files cut in pieces file based on SQL-Backup-Data.sql.

As I said identical command will split the SQL-Backup-Data.sql files in a couple of parts which of it will be sized 40 megas.

These command will generate few files output like:

freebsd# ls -1 SQL-Backup-Dat*SQL-Backup-Data.sql
SQL-Backup-Dataa
SQL-Backup-Dataab
SQL-Backup-Dataac
SQL-Backup-Dataad
SQL-Backup-Dataae

As you see the SQL-Backup-Data.sql with size 200MB is being split in four files each of which is sized 40mbytes.

After the files are transfered to another Linux or BSD host, they can easily be again united in the original file with the command:

linux:~# for i in $(ls -1 SQL-Backup-Data_split*); echo $i >> SQL-Backup-Data.sql

Alternatively in most Unices also using cat should be enough to collect back the pieces into the original file, like so:

freebsd# cat SQL-Backup-Data_split* >> SQL-Backup-Data.sql

Enjoy splitting

Universal way to configure a static IP address on ethernet lan (eth0) interface in Linux

Friday, April 29th, 2011

One of the most precious commands I ever learned to use in Linux is ifconfig and route .

They have saved my life in configuring the static IP based internet of numerous Desktop Linux computers & notebooks.

Though the usage is very much known by most of the people who are into Linux, I believe it’s likely that the newer people who entered the world of Linux or some Unix system administrators are still lacking the knowledge on how to manually configure their eth0 lan card, thus I thought it might be handy for someone to share it, I know that for most unix users & admins especially the advanced ones this post might be funny, so if you’re an advanced administrator just skip the post and don’t laught at it 😉

Now the universal commands (works on each and every Linux host) to configure manually static IP internet connection on Linux are:

linux:~# /sbin/ifconfig eth0 192.168.0.3 netmask 255.255.255.0
linux:~# /sbin/route add default gw 192.168.0.1
linux:~# echo 'nameserver 192.168.0.1' >> /etc/resolv.conf

I’ve used this simple commands on thousands ot Linux hosts and it’s still handy 🙂

In above example 192.168.0.3 is the static IP address provided by the ISP, netmask is the netmask and the second /sbin/route add default gw would set the default gateway to the example ip 192.168.0.1

The third final line would add up a resolver nameserver the Linux host would use.

Cheers 😉