Fri Aug 26 21:34:22 EEST 2011

How to configure Squirrel webmail to play nice with Bulgarian UTF-8 character encoding

Yesterday I spent most of the day playing around with Squirrelmail; I finally had enough free time to fix the Squirrelmail installed on mail.pc-freak.net

The version installed there broke after an upgrade of the Apache webserver on the FreeBSD box and failed with some stupid preg_match exception immediately after a user tried to log in. Anyway, I decided not to install Squirrelmail from the FreeBSD ports but rather to download it directly from squirrelmail.org.

Installation went smoothly, however when I tested it by sending an email typed in Bulgarian, using the default charset (UTF-8) of the desktop machine I wrote it on, the encoding of the sent emails ended up garbled.
One of my employees complained about receiving unreadable emails, so I proceeded immediately to check and fix the encoding of the webmail letters.

My logical first assumption was that the problem was caused by FreeBSD missing a correct locale, so the first thing I did to isolate the problem was to check the installed locales:

pcfreak# locale -a | grep -i UTF-8|wc -l
56


As the above command output shows, UTF-8 locales were installed, so I further checked whether the specific locale for Bulgarian UTF-8, bg_BG.UTF-8, is installed on the system:

pcfreak# locale -a |grep bg_BG.UTF-8
bg_BG.UTF-8


Being sure that bg_BG.UTF-8 is installed, I excluded missing locales as a possible cause of the problem.
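
A small side note on the check above: plain grep treats the dots in bg_BG.UTF-8 as "any character" and also matches substrings. Here is a minimal sketch of a stricter check, assuming a POSIX sh and grep; the helper name has_locale is my own invention and not part of any tool. Other systems may spell the locale differently (for example bg_BG.utf8), in which case the exact match will report it as missing.

```shell
#!/bin/sh
# has_locale NAME: succeed if NAME appears as a whole line in the
# locale list read from stdin (hypothetical helper, not a system tool).
has_locale() {
    # -F: fixed string, so dots are literal; -x: match the whole line; -q: quiet
    grep -Fxq -- "$1"
}

# Feed it the real list of installed locales.
if locale -a 2>/dev/null | has_locale bg_BG.UTF-8; then
    echo "bg_BG.UTF-8 is installed"
else
    echo "bg_BG.UTF-8 is missing"
fi
```

Reading the list from stdin keeps the helper easy to try out with a hand-written list as well.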

Next I noticed that the locale command returns the following default settings for root and my users:

pcfreak# locale
LANG=en_US.ISO8859-1
LC_CTYPE="C"
LC_COLLATE="C"
LC_TIME="C"
LC_NUMERIC="C"
LC_MONETARY="C"
LC_MESSAGES="C"
LC_ALL=


Obviously en_US.ISO8859-1 is not compatible with UTF-8, so I had to change it. A consultation with the FreeBSD handbook suggested a way to change the LANG and LC_COLLATE locale variables: creating a ~/.login_conf inside the home directory of the user whose default locale has to be set.
In my case I assumed that possibly the improper LANG was being passed to the running Apache, as Apache is started via the init script /usr/local/etc/rc.d/apache2. Therefore, to work around it for Apache, I either had to add manually:
LANG=bg_BG.UTF-8


somewhere near the beginning of the Apache init script, or alternatively set a proper .login_conf inside the root user's home dir, e.g. /root/.login_conf. An example file which sets the default locale for the root user on BSD to LANG=bg_BG.UTF-8 is shown below:

pcfreak# cat /root/.login_conf
me:\
:charset=UTF-8:\
:lang=bg_BG.UTF-8:
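
For the first option (setting LANG inside the init script), here is a minimal sketch of how the variable reaches a child process. It assumes a POSIX sh; the function name child_lang is hypothetical, and the inner sh only stands in for the daemon the init script would start. A plain assignment is only guaranteed to reach the child if the variable is exported, so the sketch exports it explicitly.

```shell
#!/bin/sh
# child_lang VALUE: set and export LANG in a subshell, then start a child
# process and print the LANG value that the child actually sees.
child_lang() {
    (
        LANG=$1
        export LANG
        # The child shell plays the role of the daemon started by the script.
        sh -c 'printf "%s\n" "$LANG"'
    )
}

child_lang bg_BG.UTF-8
# prints: bg_BG.UTF-8
```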


To set the default encoding to bg_BG.UTF-8 for all shell user accounts existing on pc-freak, I used a small script which copies /root/.login_conf to all /home directories and immediately afterwards chowns the copy to the respective user. Here is the bash one-liner I used:

pcfreak# cd /home && for i in *; do cp -pf /root/.login_conf "$i/"; chown "$i:$i" "$i/.login_conf"; done
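
Before running a loop like the one above as root, it can be worth previewing what it would do. Below is a minimal dry-run sketch, assuming a POSIX sh; the function name plan_login_conf is hypothetical. Like the one-liner, it assumes every directory name under /home is also a user name and a group name. It only prints the commands and does not change anything.

```shell
#!/bin/sh
# plan_login_conf TEMPLATE HOME_BASE
# Print (do not run) the cp and chown commands for every directory
# directly under HOME_BASE. Plain files in HOME_BASE are skipped.
plan_login_conf() {
    template=$1
    base=$2
    for dir in "$base"/*/; do
        [ -d "$dir" ] || continue      # the glob matched nothing
        dir=${dir%/}                   # drop the trailing slash
        user=$(basename "$dir")        # assumption: dir name == user name
        printf 'cp -pf %s %s/.login_conf\n' "$template" "$dir"
        printf 'chown %s:%s %s/.login_conf\n' "$user" "$user" "$dir"
    done
}

# Preview for the paths used in this article.
plan_login_conf /root/.login_conf /home
```

Once the printed list looks right, the same commands can be run for real.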


Now, after logging in again to all active shells, the default LANG setting and LC_COLLATE were changed, and I could see this by issuing the locale command again:

pcfreak# locale
LANG=bg_BG.UTF-8
LC_CTYPE="bg_BG.UTF-8"
LC_COLLATE="bg_BG.UTF-8"
LC_TIME="bg_BG.UTF-8"
LC_NUMERIC="bg_BG.UTF-8"
LC_MONETARY="bg_BG.UTF-8"
LC_MESSAGES="bg_BG.UTF-8"
LC_ALL=
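
Instead of comparing the locale output by eye, the check can be scripted. This is a minimal sketch, assuming a POSIX sh with tr and awk; the function name non_utf8_vars is my own. It reads locale output on stdin and prints the name of every variable that has a value not ending in UTF-8, so empty output means everything is fine.

```shell
#!/bin/sh
# non_utf8_vars: read `locale` output on stdin and print the name of each
# variable whose value is set but does not end in "UTF-8".
non_utf8_vars() {
    # Strip the double quotes, then split each line on "=".
    tr -d '"' | awk -F= '$2 != "" && $2 !~ /UTF-8$/ { print $1 }'
}

# Check the current session; an unset LC_ALL (empty value) is ignored.
locale 2>/dev/null | non_utf8_vars
```

Run against the old settings shown earlier, this would list LANG and all the LC_* variables set to "C".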


To make sure Apache reads the new LANG locale settings, I then forced an Apache restart:

pcfreak# /usr/local/etc/rc.d/apache2 restart


I opened a browser and sent one more mail typed in Cyrillic with Squirrelmail, addressed to my own email, to test whether the mail character encoding issues were finally gone. But NOO!! Still the same issue.
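
When a test mail arrives garbled, one quick thing to look at is which charset the message itself declares in its Content-Type header. Here is a minimal sketch, assuming a POSIX sh with sed and a message saved with plain LF line endings; the function name mail_charset and the sample message are made up for illustration.

```shell
#!/bin/sh
# mail_charset: read a raw mail message on stdin and print the charset
# declared in its headers (first match only).
mail_charset() {
    # Headers end at the first empty line, so quit there and ignore the body.
    sed -n -e '/^$/q' \
        -e 's/.*[Cc]harset="\{0,1\}\([^";[:space:]]*\).*/\1/p' | head -n 1
}

# Hypothetical sample message, only to show the usage.
mail_charset <<'EOF'
From: someone@example.com
Subject: test
Content-Type: text/plain; charset=iso-8859-1

Some body text
EOF
# prints: iso-8859-1
```

If UTF-8 text goes out with a different charset declared, the receiving mail client will decode it wrongly, which is one common way to end up with unreadable letters.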

I was out of ideas, as there seemed to be no logical reason for the Cyrillic letters to break when sent via Squirrelmail.
And then the lightbulb went on with the idea to check the Squirrelmail configuration encoding itself, so I immediately launched the Squirrelmail ./configure script and guess what, the encoding there was also improperly set to en_US.ISO8859-1!

pcfreak# cd /var/www/webmail; ./configure
SquirrelMail Configuration : Read: config.php (1.4.0)
---------------------------------------------------------
Main Menu --
1. Organization Preferences
2. Server Settings
...
8. Plugins
9. Database
10. Languages
Command >> 10
SquirrelMail Configuration : Read: config.php (1.4.0)
---------------------------------------------------------
Language preferences
1. Default Language : en_US
2. Default Charset : en_US.ISO8859-1
3. Enable lossy encoding : false
Command >>


To change the encoding to play properly with Bulgarian Cyrillic in UTF-8, I chose:
Command >> 1
SquirrelMail attempts to set the language in many ways. If it
can not figure it out in another way, it will default to this
language. Please use the code for the desired language.

[en_US]: bg_BG


Command >> 2
SquirrelMail attempts to set the language in many ways. If it
can not figure it out in another way, it will default to this
language. Please use the code for the desired language.
[en_US.ISO8859-1]: bg_BG.UTF-8


Finally, to save the new settings into the Squirrelmail configuration, I used the S command:

Command >> S
...
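
After saving, the values can be double-checked without going through the menus again. This is a minimal sketch, assuming the settings are written as single-quoted PHP assignments and that the variable names are $squirrelmail_default_language and $default_charset; both are my assumptions, so check your own config.php. The function name sm_setting is hypothetical, and the sample file below is created only for the demonstration.

```shell
#!/bin/sh
# sm_setting FILE NAME: print the single-quoted value assigned to the PHP
# variable $NAME in FILE (prints nothing if there is no such assignment).
sm_setting() {
    file=$1
    name=$2
    # \$ in the pattern matches the literal dollar sign of the PHP variable.
    sed -n "s/^[[:space:]]*\\\$${name}[[:space:]]*=[[:space:]]*'\\([^']*\\)'.*/\\1/p" "$file"
}

# Demonstration on a throwaway file with the values set in this article.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<?php
$squirrelmail_default_language = 'bg_BG';
$default_charset = 'bg_BG.UTF-8';
EOF
sm_setting "$tmp" default_charset
# prints: bg_BG.UTF-8
rm -f "$tmp"
```

On a real install, point the first argument at the config.php that the configure script reported reading.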


And Hallelujah! My Bulgarian letters started being properly encoded and sent in squirrelmail ;) thx God