Posts Tagged ‘converts’

Linux: Convert recursively files content from WINDOWS-CP1251 to Unicode UTF-8 with recode and iconv

Wednesday, January 9th, 2013

 

Linux How to make mass file convert of charset windows CP1251 toutf8 and to other encodings

Some time ago I've written a tiny article, explaining how converting of HTML or TEXT file content inside file can be converted with iconv.

Just recently, I've made mirror of a whole website with its directory structure with wget cmd. The website to be mirrored was encoded with charset Windows-1251 (which is now a bit obsolete and not very recommended to use), where my Apache Webserver to which I mirrored is configured by default to deliver file content (.html, txt, js, css …) in newer and more standard (universal cyrillic) compliant UTF-8 encoding. Thus opening in browser from my website, the website was delivered in UTF-8, whether the file content itself was with encoding Windows CP-1251; Thus I ended up seeing a lot of monkey unreadable characters instead of Slavonic letters. To deal with the inconvenience, I've used one liner script that converts all Windows-1251 charset files to UTF-8. This triggered me writting this little post, hoping the info might be useful to others in a similar situation to mine:

1. Make Mass file charset / encoding convertion with recode

On most Linux hosts, recode is probably not installed. If you're on Debian / Ubuntu Linux install it with apt;

apt-get install --yes recode

It is also installable from default repositories on Fedora, RHEL, CentOS with:

 

yum -y install recode

Here is recode description taken from man page:

NAME
       recode – converts files between character sets

find . -name "*.html" -exec recode WINDOWS-1251..UTF-8 {} \;

If you have few file extensions whose chracter encoding needs to be converted lets say .html, .htm and .php use cmd:

find . -name "*.html" -o -name '*.htm' -o -name '*.php' -exec recode WINDOWS-1251..UTF-8 {} \;

Btw I just recently learned how one can look for few, file extensions with find under one liner the argument to pass is -o -name '*.file-extension', as you can see from  example, you can look for as  many different file extensions as you like with one find search command.

After completing the convertion, I've remembered that earlier I've also used iconv on a couple of occasions to convert from Cyrillic CP-1251 to Cyrillic UTF-8, thus for those who prefer to complete convertion with iconv here is an alternative a bit longer method using for cycle + mv and iconv.

2. Mass file convertion with iconv

for i in $(find . -name "*.html" -print); do
iconv -f WINDOWS-1251 -t UTF-8 $i > $i.utf-8;
mv $i $i.bak;
mv $i.utf-8 $i;
done

As you see in above line of code, there are two occurances of move command as one is backupping all .html files and second mv overwrites with files with converted encoding. For any other files different from .html, just change in cmd find . -iname '*.html' to whatever file extension.

AEWAN – a nice advanced GNU / Linux console ASCII art text editor

Saturday, May 19th, 2012

I'm a guy fascinated by ASCII art, since the very early days I saw a piece of this awesome digital art.

As time passed and computers went to be used mostly  graphics resolution, ASCII art loose its huge popularity from the early DOS and BBS (internet primordial days).

However, this kind  of art is still higly valued by true computer geeks.
In that manner of thoughts, lately I'm researching widely on ASCII art tools and ASCII art open source tools available for Linux.
Last time I check what is available for 'ASCII job' was before 5 years time. Recently I decided to review once again and see if there are new software for doing ascii manipulations on Linux and this is how this article got born.

My attention was caught by aewan (ASCII-art Editor Without A Name), while searching for ASCII keyword description packages with:

apt-cache search ascii

Aewan project official website is on sourceforge check it out here

Here is the complete description of the Debian package:

hipo@noah:~$ apt-cache show aewan|grep -i description -A 5
Description: ASCII-art Editor Without A Name
aewan is an ASCII art editor with support for multiple layers that can be
edited individually, colors, rectangular copy and paste, and intelligent
horizontal and vertical flipping (converts '\' to '/', etc). It produces
both stand-alone art files and an easy-to-parse format for integration
into your terminal applications.

I installed it to give it a try:

noah:~# apt-get --yes install aewan
Selecting previously deselected package aewan.
(Reading database ... 388522 files and directories currently installed.)
Unpacking aewan (from .../aewan_1.0.01-3_amd64.deb) ...
Processing triggers for man-db ...
Setting up aewan (1.0.01-3) ...

aewan package provides three executable binaries:

noah:~# dpkg -L aewan|grep -i /bin/ /usr/bin/aecat
/usr/bin/aewan
/usr/bin/aemakeflic

1. aewan binary is the ascii-art editor itself

2. aecat is utility to display an aewan documents (aewan format saved files)3. aemakeflictool to produce an animation from an aewan document

Next I ran it in plain console tty  to check how it is like:

hipo@noah:~$ aewan

Below are screenshots to give you an idea how powerful aewan ASCII art editor is:

AEWAN ASCII art editor entry information screen Debian GNU / Linux shot

Aewan immediate entry screen after start up

Aewan ASCII art editor Linux showing the major functionality of aewan on Debian GNU / Linux Squeeze

Aewan ASCII art editor – all of the supported tool functions

As you can see from the shot the editor is very feature rich. I was stunned to find out it even supports layers (in ASCII!!) (w0w!). 
It even has a Layers Manager (like GIMP) 🙂

To create my first ASCII art I used the:

New

menu.

This however didn't immediately show the prompt, where I can type  the ascii characters to draw my picture. In order to be able to draw inside the editor, its necessary to open at least one layer, through using the menu:

Add Layer (defaults)

then the interactive ASCII art editor appeared.

While an ASCII art is created with the editor you can select the color of the input characters by using Drawing Color menu seen in the above screenshot.

aewan drawing color choose color Linux shot

I've played few minutes and created a sample ascii art, just to test the color and editor "look & feel", my conclusions are the editor chars drawing is awesome.

Aewan ascii art produced on my Debian GNU / Linux host

All the commands available via menus are also accessible via a shortcut key combinations:

Aewan Linux Ascii art editor quick key shortcut commands

aewan controls are just great and definitely over-shadows every other text editor I used to draw an ASCII art so far.
Once saved the ASCII art, are by default saved in a plain gzipped ascii text. You can therefore simply zcat the the saves;
Don't expect zcat to show you the ascii as they're displayed in aewan, zcat-ing it will instead  display just the stored meta data; the meta data is interpreted and displayed properly only with aecat command.

aewan aecat displaying properly previously saved ascii art picture

I've checked online for rpm builds too and such are available, so installing on Fedora, CentOS, SuSE etc. is up to downloading the right distro / hardware architecture rpm package and running:

# rpm -ivh aewan*.rpm

On the official website, there are also instructions to compile from source, Slackware users and users of other distros which doesn't have a package build should compile manually with the usual:

$ tar -zxf aewan-1.0.01.tar.gz
$ cd aewan-1.0.01
$ ./configure
$ make
$ su -c "make install"

For those inrested to make animations with aemakeflic you need to first save a multiple layers of pictures. The idea of creating ASCII art video is pretty much like the old school way to make animation "draw every scene" and movie it. Once all different scene layers of the ASCII art animation are prepared one could use  aemakeflic to export all the ASCII layers as common video.

aemakeflic has the ability to export the ASCII animation in a runnable shell script to display the animation. The other way aemakeflic can be used is to produce a picture in kind of text format showing the video whether seen with  less cmd.
Making ASCII animation takes a lot of time and effort. Since i'm too lazy and I lack the time I haven't tested this functionality. Anyways I've seen some ascii videos on telnet  to remote hosts (some past time); therefore I guess they were made using aewan and later animated with aemakeflic.

I will close this post with a nice colorful ASCII art, made with aewan (picture is taken from the project page):

Aewan Flipping Selection Screenshot