Posts Tagged ‘conversion’

Create PNG, JPG, GIF pictures / images from PDF on Linux

Saturday, February 25th, 2012

I've received a PDF file with a plan for the development of a bundle of projects. My task was to evaluate this plan and give feedback on the 44-page PDF document.

Since I don't know of a program that can directly edit PDF files on GNU / Linux, my initial idea was to open and convert the PDF to ODT / DOC with OpenOffice (LibreOffice) and then edit the ODT file.
Unfortunately the OpenOffice oowriter program was unable to open / display the PDF file. My assumption is that OpenOffice failed to open the PDF because it was generated on Microsoft Windows with Adobe Illustrator or something similar.

The alternative that came to my mind was to convert the PDF to pictures, edit them and then convert the pictures back to PDF.
In other words, to follow these 3 steps:
1. Convert the PDF document to multiple images
2. Edit each of the images with GIMP or Inkscape
3. Convert all images back to a single PDF file

Some time ago I wrote an article on how to create a PDF file from many image files in JPEG, PNG or GIF on Linux. That article describes exactly how to complete Step 3. Therefore all that was left was to find a way to convert the PDF file to multiple JPEG / PNG / GIF images.

The convert command to create a PDF document from multiple pictures, taken from my earlier article, is:

$ convert *.jpg outputpdffile.pdf
In Step 1 I was aiming to do the opposite of what I had done previously.

Hence, in order to convert the single Project.pdf file to multiple PNG images, I just switched the order of convert's input and output arguments.

hipo@noah:~/project-pdf-to-images$ convert Project.pdf Project.png
...

I did the PDF to pictures conversion on my notebook running Debian Squeeze (6.0.2) GNU / Linux. Converting the PDF file to 44 images took 25 seconds on my dual core 1.8 GHz / 2GB RAM ThinkPad R61.
Afterwards I had 44 generated PNG files at hand, e.g.:

hipo@noah:~/project-pdf-to-images$ ls -al Project-*.png |wc -l
44

convert was also smart enough to produce correct file naming. The output file names were:
Project-1.png
Project-2.png
etc.

Nicely, each number (e.g. -1.png) corresponds to the respective PDF page. For instance, Project-10.png corresponds to page 10 of the Project.pdf file.
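A minimal sketch of the same conversion with two extras, assuming ImageMagick's convert (with Ghostscript behind it) is installed. The function names and the 150 DPI default are my own illustrative choices, not something from the run above. -density raises the rendering resolution so the pages stay readable when edited, and a %02d pattern in the output name gives zero-padded numbers that sort correctly. It is worth checking which number the first output file gets, since that decides how file numbers map to page numbers.

```shell
# Build the zero-padded output pattern for a PDF, e.g. Project.pdf -> Project-%02d.png
png_pattern() {
    printf '%s-%%02d.png\n' "${1%.*}"
}

# Render every page of a PDF to its own PNG file.
# $1 = PDF file, $2 = resolution in DPI (hypothetical default: 150)
pdf_to_pngs() {
    pdf=$1
    dpi=${2:-150}
    # -density has to come before the input file so it applies to the PDF rendering.
    # %02d in the output name is replaced by a two-digit sequence number.
    convert -density "$dpi" "$pdf" "$(png_pattern "$pdf")"
}

# Usage: pdf_to_pngs Project.pdf 150
```

Higher DPI values give sharper pages but also bigger files and a slower conversion.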

Rather ironically, after converting the PDF to pictures, while opening Project-1.png I noticed that GIMP (the GNU Image Manipulation Program) is capable of reading PDF files directly. GIMP has the option to open the pages either as layers or as separate images 😉
Anyway, even if GIMP is used to modify the different PDF pages as layers, GIMP doesn't have the ability to save the result as a PDF, and if the layers are merged on save the resulting picture becomes ONE BIG MESS.
Therefore it seems my 3 step way, i.e.:

1. conversion of the PDF to pictures
2. picture editing with GIMP or Inkscape
3. conversion of the pictures back to PDF

is still the only way I know of to "modify PDF" on Linux or the BSDs. I will be glad to hear if someone has come up with a better solution.
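One pitfall in step 3 is page order: a plain Project-*.png glob sorts Project-10.png before Project-2.png, so the pages of the rebuilt PDF would come out shuffled. Here is a rough sketch of a workaround, assuming GNU sort (the -V version sort is a GNU extension) and file names without spaces. The function names are hypothetical.

```shell
# List the page images in numeric page order (Project-2.png before Project-10.png).
# $1 = file name prefix, e.g. Project
pages_in_order() {
    printf '%s\n' "$1"-*.png | sort -V
}

# Rebuild a single PDF from the edited page images.
# $1 = file name prefix, $2 = output PDF
pngs_to_pdf() {
    # The command substitution is left unquoted on purpose so that every
    # file name becomes its own argument (file names must not contain spaces).
    convert $(pages_in_order "$1") "$2"
}

# Usage: pngs_to_pdf Project Project-edited.pdf
```

If the images were written with zero-padded numbers in the first place, the plain glob already sorts correctly and the sort step is not needed.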

 

How to convert AVI, MP4, FLV (flash video) and other non-free video encoded formats to Free Video format encoding OGV (Ogg Vorbis / Theora) on GNU / Linux and FreeBSD

Thursday, November 17th, 2011


I was looking for a way to convert some video and sound files downloaded from YouTube (mostly things dedicated to free software). As far as I could find online, these pieces of nice music and tutorials are unfortunately not available for download anywhere else, or at least not in one of the open / free formats (Ogg Vorbis audio or OGV, i.e. Ogg / Theora video).

When it comes to conversion between different formats, the first things I think of are always ffmpeg or mencoder. However, I was not sure whether either of these tools would do the trick, so I did some quick research online for a specialised console or GUI program that can convert MP4, FLV etc. to OGV.

In less than 10 minutes I found a thread mentioning ffmpeg2theora, a simple converter to create Ogg Theora files.

As I’m running Debian GNU / Linux, I installed ffmpeg2theora straight via apt. According to some reports online, the ffmpeg2theora command line conversion tool is also available straight from the repositories on Ubuntu.
On FreeBSD there is a port, /usr/ports/multimedia/ffmpeg2theora, available for install. Of course ffmpeg2theora can also be installed from source on other Linux distributions that might be missing a pre-built binary.

Using ffmpeg2theora to convert a non-free video format is very simple, though the tool provides numerous options for those who want to customize the converted video.
To convert the flash file “The Gnu Song.flv” to “The Gnu Song.ogv”, for example, I invoked ffmpeg2theora like this:

debian:~# ffmpeg2theora "The Gnu Song.flv"
...

The conversion took a few minutes, as my machine is not ultra powerful and apparently the conversion to OGV format is not too quick, but the good news is it works.
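For more than one file, the same call can be wrapped in a loop. This is only a sketch: the function names and the quality values are my own illustrative picks. It assumes the ffmpeg2theora options -v (video quality, 0 to 10), -a (audio quality, -2 to 10) and -o (output file).

```shell
# Output name for a given input: the last extension is replaced with .ogv
ogv_name() {
    printf '%s\n' "${1%.*}.ogv"
}

# Convert every .flv file in the current directory to Ogg Theora.
convert_all_flv() {
    for f in ./*.flv; do
        [ -e "$f" ] || continue    # no .flv files here: the glob stayed unexpanded
        # Quality values are illustrative; higher means better quality and bigger files.
        ffmpeg2theora -v 6 -a 3 -o "$(ogv_name "$f")" "$f"
    done
}

# Usage: cd ~/videos && convert_all_flv
```

Quoting "$f" everywhere keeps file names with spaces, like “The Gnu Song.flv”, in one piece.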
After the conversion was completed I used ogginfo to check the information about the newly converted file The Gnu Song.ogv. Below is the file info ogginfo returns:

debian:~# ogginfo "The Gnu Song.ogv"
Processing file "The Gnu Song.ogv"...

New logical stream (#1, serial: 5d65413f): type skeleton
New logical stream (#2, serial: 0570412d): type theora
New logical stream (#3, serial: 7e679651): type vorbis
Theora headers parsed for stream 2, information follows…
Version: 3.2.1
Vendor: Xiph.Org libtheora 1.1 20090822 (Thusnelda)
Width: 320
Height: 240
Total image: 320 by 240, crop offset (0, 0)
Framerate 25/1 (25.00 fps)
Aspect ratio undefined
Colourspace: Rec. ITU-R BT.470-6 Systems B and G (PAL)
Pixel format 4:2:0
Target bitrate: 0 kbps
Nominal quality setting (0-63): 32
User comments section follows…
ENCODER=ffmpeg2theora-0.24
Vorbis headers parsed for stream 3, information follows…
Version: 0
Vendor: Xiph.Org libVorbis I 20101101 (Schaufenugget)
Channels: 1
Rate: 22050
Nominal bitrate: 30.444000 kb/s
Upper bitrate not set
Lower bitrate not set
User comments section follows…
ENCODER=ffmpeg2theora-0.24
Logical stream 1 ended
Theora stream 2:
Total data length: 1525324 bytes
Playback length: 2m:41.360s
Average bitrate: 75.623401 kb/s
Logical stream 2 ended
Vorbis stream 3:
Total data length: 646729 bytes
Playback length: 2m:41.384s
Average bitrate: 32.059041 kb/s

ogginfo is part of the vorbis-tools package. vorbis-tools also contains a few other helpful tools for working with Ogg files. The complete list of binaries vorbis-tools contains on Debian, as of the time of writing this post, is:

/usr/bin/ogg123
/usr/bin/oggenc
/usr/bin/oggdec
/usr/bin/ogginfo
/usr/bin/vcut
/usr/bin/vorbiscomment
/usr/bin/vorbistagedit

ogg123 is a player for ogg files, however as far as I’ve tested it, it doesn’t work too well. For comparison, ogg audio files played just fine using the play command.
oggenc is used to encode an ogg audio file from an audio stream handed to it, for instance one decoded from another format (let’s say mp3). Hence oggenc can be used to convert mp3 files to ogg audio files, like so:

debian:~# mpg321 input.mp3 -w - | oggenc -o output.ogg -

oggdec is used to convert ogg audio files to wav files or raw PCM audio.
vcut is used to cut an Ogg Vorbis file into parts.
vorbiscomment and vorbistagedit are used to edit the tag information of already existing ogg audio files.
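Here is a small sketch of how these tools could be called from the shell. The function names are hypothetical and the -q 5 quality value is just an illustrative choice. It assumes oggenc's -q and -o options, oggdec's -o option and vorbiscomment's -l option.

```shell
# Name of the Ogg Vorbis file for a given input file (extension replaced with .ogg)
ogg_name() {
    printf '%s\n' "${1%.*}.ogg"
}

# mp3 -> ogg: mpg321 decodes to WAV on stdout, oggenc encodes from stdin
mp3_to_ogg() {
    mpg321 "$1" -w - | oggenc -q 5 -o "$(ogg_name "$1")" -
}

# ogg -> wav
ogg_to_wav() {
    oggdec -o "${1%.*}.wav" "$1"
}

# Print the tags (comments) stored in an ogg audio file
show_tags() {
    vorbiscomment -l "$1"
}

# Usage: mp3_to_ogg input.mp3 && show_tags input.ogg
```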

There is also a GUI program called oggconvert for people who don’t want to bother with typing on the command line. OggConvert is written for GNOME and uses the GTK library. Here is what the program looks like:

OggConvert GUI program to convert to OGG or OGV Theora on GNU / Linux and FreeBSD

 

How to convert UTF-8 encoding files to Windows CP1251 on GNU / Linux

Friday, October 21st, 2011

I needed to convert a file containing Bulgarian text written in UTF-8 encoding to Windows CP1251 in order to fix a website's encoding problems after moving the website from one physical server to another.

I first tried with enca (which detects and converts the encoding of text files).

The exact way I tried to convert was:

linux:~# enca -L bg /home/site/www/includes/utf8_encoded_file.php
...
Unfortunately this attempt to convert was unsuccessful, and the second logical guess was to use iconv (convert encoding of given files from one encoding to another) to do the utf8 to cp1251 conversion.
I reached out for some help on irc.freenode.net, in the #varnalab channel, and Alex Kuklin helped me by giving me an example command line to do the conversion.
The iconv utf8 to cp1251 conversion line he pointed me to was:

linux:~# iconv -f utf8 -t cp1251 < in > out

Further on I adapted Alex’s example to convert the Bulgarian characters in my utf8_encoded_file.php to CP1251, and used the following commands to convert the file and keep a backup of my original UTF8 file:

linux:~# cd /home/site/www/includes
linux:/home/site/www/includes# iconv -f utf8 -t cp1251 < utf8_encoded_file.php > utf8_encoded_file.php.cp1251
linux:/home/site/www/includes# mv utf8_encoded_file.php utf8_encoded_file.php.bak
linux:/home/site/www/includes# mv utf8_encoded_file.php.cp1251 utf8_encoded_file.php
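
The three steps above can be wrapped in one small function, so that the original file is only replaced when iconv actually succeeds (for example, iconv fails if the file contains a character that has no CP1251 equivalent). This is just a sketch; the function name is hypothetical.

```shell
# Convert a UTF-8 file to CP1251 in place, keeping the original as FILE.bak.
utf8_to_cp1251() {
    f=$1
    if iconv -f UTF-8 -t CP1251 < "$f" > "$f.cp1251"; then
        # Conversion worked: keep a backup, then move the new file into place.
        mv "$f" "$f.bak" && mv "$f.cp1251" "$f"
    else
        # Conversion failed: drop the partial output, leave the original alone.
        rm -f "$f.cp1251"
        echo "conversion of $f failed, original left untouched" >&2
        return 1
    fi
}

# Usage: utf8_to_cp1251 /home/site/www/includes/utf8_encoded_file.php
```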

How to convert any internet Webpage to PDF from command line on GNU/Linux

Friday, September 30th, 2011


If you're looking for a command line utility to generate a PDF file out of any webpage located online, you are looking for wkhtmltopdf.
The tool converts webpages to PDF using Apple's open source WebKit rendering engine.
wkhtmltopdf is very useful for web developers, as some websites are required to produce PDFs dynamically from remote web pages.
wkhtmltopdf is shipped with Debian Squeeze 6 and the latest Ubuntu Linux versions, but has not yet entered the Fedora and CentOS repositories.

To use wkhtmltopdf on Debian / Ubuntu distros, install it via apt:

linux:~# apt-get install wkhtmltopdf
...

Next, to convert a webpage of choice, use the command:

linux:~$ wkhtmltopdf www.pc-freak.net pc-freak.net_website.pdf
Loading page (1/2)
Printing pages (2/2)
Done

If the web page to be snapshotted is long, wkhtmltopdf generates a PDF of several pages.
wkhtmltopdf can also create the website snapshot with a specified orientation, Landscape or Portrait, by passing the -O option to it, like so:

linux:~$ wkhtmltopdf -O Portrait www.pc-freak.net pc-freak.net_website.pdf

wkhtmltopdf has many useful options. Here are some of them:
 

  • Javascript disabling – Disable support for javascript for a website
  • Grayscale pdf generation – Generates the PDF in grayscale
  • Low quality pdf generation – Useful to shrink the size of the generated pdf
  • Set PDF page size – (A4, Letter etc.)
  • Add zoom to the generated pdf content
  • Support for password HTTP authentication
  • Support to use the tool over a proxy
  • Generation of a Table of Contents based on titles (only in static version)
  • Adding of Headers and Footers (only in static version)

To generate an A4 page with wkhtmltopdf:

wkhtmltopdf -s A4 www.pc-freak.net/blog/ pc-freak.net_blog.pdf
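
Several of the options listed above can be combined in one call. Below is a rough sketch assuming the short flags -s (page size), -O (orientation), -g (grayscale) and -l (low quality). The helper that derives a file name from the URL is a hypothetical convenience of mine, not part of wkhtmltopdf.

```shell
# Turn a URL into a PDF file name: strip the scheme and any trailing slashes,
# then replace the remaining slashes with underscores.
url_to_pdf_name() {
    name=${1#*://}    # drop http:// or https:// if present
    while [ "${name%/}" != "$name" ]; do
        name=${name%/}
    done
    printf '%s.pdf\n' "$(printf '%s' "$name" | tr '/' '_')"
}

# Snapshot a page as a small grayscale A4 PDF in portrait orientation.
snapshot() {
    wkhtmltopdf -s A4 -O Portrait -g -l "$1" "$(url_to_pdf_name "$1")"
}

# Usage: snapshot www.pc-freak.net/blog/
```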

wkhtmltopdf looks promising but still seems a bit buggy. Here is what happened when I tried to create a PDF without setting the A4 page format:

linux:$ wkhtmltopdf www.pc-freak.net/blog/ pc-freak.net_blog.pdf
Loading page (1/2)
OpenOffice path before fixup is '/usr/lib/openoffice' ] 71%
OpenOffice path is '/usr/lib/openoffice'
OpenOffice path before fixup is '/usr/lib/openoffice'
OpenOffice path is '/usr/lib/openoffice'
** (:12057): DEBUG: NP_Initialize
** (:12057): DEBUG: NP_Initialize succeeded
** (:12057): DEBUG: NP_Initialize
** (:12057): DEBUG: NP_Initialize succeeded
** (:12057): DEBUG: NP_Initialize
** (:12057): DEBUG: NP_Initialize succeeded
** (:12057): DEBUG: NP_Initialize
** (:12057): DEBUG: NP_Initialize succeeded
Printing pages (2/2)
Done
Printing pages (2/2)
Segmentation fault

The Debian and Ubuntu versions of wkhtmltopdf do not support TOC generation or adding headers and footers. To get these features one has to download and install the static version of wkhtmltopdf.
Using the static version of the tool is also the only option for anyone on Fedora or any other RPM based Linux distro.

How to convert Ogg Video (.ogv) to Flash video (.flv) on Linux and FreeBSD

Thursday, September 29th, 2011

ffmpeg is the de facto standard for video conversion on Linux and BSD platforms. I was more than happy to find out that ffmpeg is capable of converting the .ogv file format to .flv (Flash Video).
Converting Ogg Theora video to Flash video on Linux is a real piece of cake with ffmpeg.
Here is how to convert .ogv to .flv:

debian:~# ffmpeg -i ogg_vorbis_video_to_convert_.ogv converted_ogg_vorbis_video_to_flash_video.flv
...

Converting a 14MB Ogg video to flv took 28 seconds, and the newly produced converted_ogg_vorbis_video_to_flash_video.flv came out at a size of 9MB. This was on a system with 2 GB of memory and a dual core 1.8 GHz Intel CPU.
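
If the plain call complains about the audio sample rate or the result looks too blocky, the rate and bitrate can be set explicitly. This is only a sketch: the function names and the values are illustrative assumptions. Recent ffmpeg builds spell the video bitrate option -b:v, older ones use plain -b.

```shell
# Output name for a given input: the last extension is replaced with .flv
flv_name() {
    printf '%s\n' "${1%.*}.flv"
}

# Convert an Ogg video to Flash video with explicit audio rate and video bitrate.
ogv_to_flv() {
    # -ar 44100: FLV with MP3 audio only accepts 44100, 22050 or 11025 Hz
    # -b:v 500k: illustrative video bitrate
    ffmpeg -i "$1" -ar 44100 -b:v 500k "$(flv_name "$1")"
}

# Usage: ogv_to_flv ogg_vorbis_video_to_convert_.ogv
```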

How to fix encoding problems with umlauts not showing after import of SQL data to MySQL

Thursday, October 1st, 2009

I’m restoring some websites from backups these days. One of the Swiss websites had a serious problem with umlauts not showing up.
This happened right after I used an old dump from a MySQL server running version 4.x; the data was imported into MySQL server version 5. The problem was that wherever an umlaut such as ü should have appeared, garbled characters were shown instead.

You can imagine how annoying and ugly that looked; the whole text was messed up.
After some googling and with the help of one of my colleagues (a programmer), I was pointed to this nice article: Mysql Latin1 Utf8 Conversion.
What happened is that for some reason the dump I had made declared the latin1 character set even though the data inside was in utf8.
Thus importing the dump would import the data as latin1 and garble it. The fix is as simple as substituting latin1 with utf8 in your mysql dump file and then reimporting it.
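The substitution can be done with sed instead of by hand. A minimal sketch, assuming the dump declares its character set in the usual CHARSET=latin1 and SET NAMES latin1 forms; the function, file and database names are hypothetical. Note that a blanket replace also touches any row data that happens to contain the same strings, so it is worth diffing the result against the original dump.

```shell
# Rewrite latin1 character set declarations to utf8 in a dump read from stdin.
fix_dump_charset() {
    sed -e 's/CHARSET=latin1/CHARSET=utf8/g' \
        -e 's/SET NAMES latin1/SET NAMES utf8/g'
}

# Usage (hypothetical file and database names):
#   fix_dump_charset < dump.sql > dump.utf8.sql
#   mysql --default-character-set=utf8 mydb < dump.utf8.sql
```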
In my case the browser displayed the website characters in iso8859 by default instead of utf8, so I had to specifically change the browser encoding to UTF8 to confirm that all was okay.
Then it was necessary to modify all the templates to use UTF8 instead of the wrong character encoding. I have no clue how the same umlauts displayed correctly on the old server; what I suspect is that it had something to do with Apache’s default character encoding, which I probably had set to utf8 there.
Well, so far so good, let’s see how much trashy stuff I have to deal with today.