Fri Oct 21 15:46:18 EEST 2011

How to convert UTF-8 encoding files to Windows CP1251 on GNU / Linux

I needed to convert a file which had a Bulgarian text written in UTF-8 encoding to Windows CP1251 in order to fix a website encoding problems after a move of the website from one physical server to another.

I tried first with enca - ( detects and convert encoding of text files from one encoding to another).

The exact way I tried to convert was:

linux:~# enca -L bg /home/site/www/includes/utf8_encoded_file.php
...
Unfortunately this attempt to conver was unsucesfully, and the second logical guess was to use iconv - Convert encoding of given files from one encoding to another to do the utf8 to cp1251 conversion.
I reached for some help in irc.freenode.net, #varnalab channel and Alex Kuklin helped me, giving me an example command line to do the conversion.
iconv winedows to cp1251 conversion line, he pointed to me was:

linux:~# iconv -f utf8 -t cp1251 < in > out


Further on I adapted Alex's example to convert my utf8_encoded_file.php encoded Bulgarian characted to CP1251 and used the following commands to convert and create backups of my original UTF8 file:

linux:~# cd /home/site/www/includes
linux:/home/site/www/includes# iconv -f utf8 -t cp1251 < utf8_encoded_file.php in > utf8_encoded_file.php.cp1251
linux:/home/site/www/includes# mv utf8_encoded_file.php utf8_encoded_file.php.bak
linux:/home/site/www/includes# mv utf8_encoded_file.php.cp1251 utf8_encoded_file.php