Posts Tagged ‘bak’

Remove string line from file on Linux and BSD – Delete entire line with string from file

Tuesday, March 15th, 2016


If you're already used to running grep -v "something" filename to print everything from a file except the lines containing a given string, you may want to delete those lines from the file itself. The obvious way is to write the grep output to a second file and then overwrite the original with it:
 

grep -v 'whatever' filename > filename1
mv filename1 filename


A much better way to delete a whole line containing a string match from a file is to use sed. sed should be the tool of choice, especially if you're scripting, because it was made for exactly this kind of batch editing.

Here is how to delete an entire line based on a given string:

sed --in-place '/some string to search and delete/d' myfilename


It is a good idea to also create a backup, just to make sure nothing gets deleted by accident. To do so use:

sed --in-place=.bak '/some string to search and delete/d' myfilename
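
A minimal sketch of the backup variant on a throwaway file, to see what ends up where. The scratch directory and the sample content are made up for the illustration. The short form -i.bak is used here because it is accepted by both GNU sed and the BSD sed, while the long --in-place spelling is GNU-only.

```shell
# work in a scratch directory so nothing real gets touched (names are hypothetical)
workdir=$(mktemp -d)
printf 'keep me\nsome string to search and delete\nkeep me too\n' > "$workdir/myfilename"

# delete every line containing the string, keeping the original as myfilename.bak
sed -i.bak '/some string to search and delete/d' "$workdir/myfilename"

# count the lines in the edited file and in the backup
remaining=$(wc -l < "$workdir/myfilename" | tr -d ' ')
backed_up=$(wc -l < "$workdir/myfilename.bak" | tr -d ' ')
echo "lines left: $remaining, lines in backup: $backed_up"
```

If the result is not what you expected, moving myfilename.bak back over myfilename restores the original.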

If you need to wipe out an exact string from all files within a folder, you can use a for loop or perl (for some good examples check my previous article here).

In short, here is how to remove all lines with a string match from all files within a Linux directory, first with bash's for loop and then with find, which also keeps a .bak backup of each file:

 

for f in *.txt; do sed --in-place '/some string/d' "$f"; done
find . -name '*.txt' -exec sed --in-place=.bak '/some string/d' "{}" ';'
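
One possible refinement, sketched under my own assumptions rather than taken from the commands above: check each file with grep -q first, so that only files which really contain the string are edited and get a .bak copy. The scratch directory and file names are hypothetical.

```shell
# scratch directory with one matching and one non-matching file (hypothetical names)
workdir=$(mktemp -d)
printf 'alpha\nsome string here\n' > "$workdir/a.txt"
printf 'beta\ngamma\n' > "$workdir/b.txt"

for f in "$workdir"/*.txt; do
    # grep -q prints nothing and only reports via its exit status whether the string occurs
    if grep -q 'some string' "$f"; then
        echo "editing $f"
        sed -i.bak '/some string/d' "$f"
    fi
done
```

This way the directory is not littered with backups of files that were never changed.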

 

BTW sed is a really rich editor, and some people got so much into it that there is even a text (console) version of arkanoid written in sed 🙂


If you want to break the ice and get some fun in your boring sysadmin life, get the sed arkanoid code from here.
I have it installed under the pc-freak.net free ASCII Games entertainment service, so if you want to give it a try just log in and play.

Enjoy 🙂

How to convert file content encoded in windows-cp1251 charset to UTF-8 (with iconv) to be delivered properly encoded to browsing end clients

Wednesday, May 16th, 2012


I have a bunch of old HTML files, all encoded in the historically obsolete windows-1251 (cp1251). Windows-1251 was in common use 7 years ago, and therefore big portions of the web content in Bulgarian / Russian Cyrillic are still transferred to end users in this encoding.

This was just before the "UTF-8 revolution", when people massively started using UTF-8.
It was already clear that the national, country specific text encoding standards would quickly be replaced by UTF-8, the universal encoding format whose abbreviation stands for Unicode Transformation Format, 8-bit.

Though UTF-8 was clearly "the future", many web developers, mostly because of incompetence or because they learned to write HTML from old sources, continued to use windows-1251 in their HTML. I'm even convinced there are still developers out there who are writing websites for Bulgarian / Russian / Macedonian customers using obsolete encodings …

The smarter developers among those accustomed to windows-1251, KOI8-R etc. were using the meta tag to specify the charset of the web page content with:

<meta http-equiv="content-type" content="text/html;charset=windows-1251">

or

<meta http-equiv="content-type" content="text/html;charset=koi8-r">

Anyhow, many devs didn't even place the windows-1251 charset in the head of the HTML …

The result for the system administrator is always a mess: a lot of web pages showing unreadable signs and tons of unhappy customers.
As always the system administrator is considered responsible for the programmers' mistakes :). So instead of the programmers fixing their bad cooking, the admin has to fix it all!

One quick workaround I have applied as admin for Cyrillic pages in windows-1251 that fail to display is to force windows-1251 as the default encoding for the whole virtualhost or Apache directory, with Apache directives like:

<VirtualHost *:80>
ServerAdmin some_user@some_host.com
DocumentRoot /var/www/html
AddDefaultCharset windows-1251
ServerName the_host_name.com
ServerAlias www.the_host_name.com
....
....
<Directory /var/www/html>
AddDefaultCharset windows-1251
</Directory>
</VirtualHost>

Though this mostly works, there are occasions where only particular html files out of all the content served by Apache are encoded in windows-1251. If most of the content is already written in UTF-8, this is a big issue, as you cannot just change the charset globally from UTF-8 to windows-1251 only because a few pages are written in an archaic encoding.
Most of the content is then displayed to the client by Apache just fine (as explained before), and only particular htmls, let's say single1.html, single2.html etc., are displayed with question marks or some non-human readable "hieroglyphs".

Below are screenshots of two pages returned to my browser with a wrongly set html charset:

[Screenshot: Windows CP1251 page shown as question marks when Apache serves it as UTF-8]

[Screenshot: CP1251 page delivered by Apache in a wrong non-UTF-8 encoding]

When this kind of issue occurs, the only solution is to simply log in to the server and use the iconv command to convert all files returning unreadable content from whatever the non UTF-8 encoding is, in my case the Bulgarian cp1251, to UTF-8.

Here is how to use the iconv command to convert the two sample files named single1.html and single2.html from windows-1251 to UTF-8:

server:/web# /usr/bin/iconv -f WINDOWS-1251 -t UTF-8 single1.html > single1.html.utf8
server:/web# mv single1.html single1.html.bak;
server:/web# mv single1.html.utf8 single1.html
server:/web# /usr/bin/iconv -f WINDOWS-1251 -t UTF-8 single2.html > single2.html.utf8
server:/web# mv single2.html single2.html.bak;
server:/web# mv single2.html.utf8 single2.html

I always make copies of the original cp1251 encoded files (as you see with mv single1.html single1.html.bak), because if something goes wrong with the conversion I can easily revert back.

If there are 10 files with consecutive numbers in their names, they can be converted using a short for loop, like so:

server:/web# for i in $(seq 1 10); do
/usr/bin/iconv -f WINDOWS-1251 -t UTF-8 single$i.html > single$i.html.utf8
mv single$i.html single$i.html.bak
mv single$i.html.utf8 single$i.html
done
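
Running such a loop twice would convert already converted files a second time and garble them. A minimal sketch of one way to guard against that, assuming it is good enough to treat any file that already decodes cleanly as UTF-8 as "done" (iconv exits with an error on invalid input). The scratch directory and the sample file content are made up for the illustration.

```shell
workdir=$(mktemp -d)
# sample page: the Cyrillic word "Здравей" written as raw windows-1251 bytes (octal escapes)
printf '\307\344\360\340\342\345\351\n' > "$workdir/single1.html"

for f in "$workdir"/single*.html; do
    # a file that already decodes as UTF-8 is skipped, so the loop can be re-run safely
    if iconv -f UTF-8 -t UTF-8 "$f" > /dev/null 2>&1; then
        echo "skipping $f (already valid UTF-8)"
        continue
    fi
    # convert, and only swap the files around if the conversion succeeded
    iconv -f WINDOWS-1251 -t UTF-8 "$f" > "$f.utf8" &&
        mv "$f" "$f.bak" &&
        mv "$f.utf8" "$f"
done
```

Note that a pure ASCII file also counts as valid UTF-8 and is skipped, which is harmless because converting it would not change anything.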

As mentioned earlier, single1.html, single2.html … may have this in the html <head>:

<meta http-equiv="Content-Type" content="text/html; charset=windows-1251">

You should open each of the files in question and wipe out the line by hand, or use sed to wipe it in one loop if it has to be done for let's say 10 files named single{1..10}.html:

server:/web# for i in $(seq 1 10); do
sed '/<meta http-equiv="Content-Type" content="text\/html; charset=windows-1251">/d' single$i.html > single$i.html.new
# .bak2, so the cp1251 originals saved as .bak above are not overwritten
mv single$i.html single$i.html.bak2
mv single$i.html.new single$i.html
done

Well now,

Auto restart Apache on High server load (bash shell script) – Fixing Apache server temporal overload issues

Saturday, March 24th, 2012


I've written a tiny script to check and restart Apache if the server encounters an extremely high load average, for instance higher than 25. Below is an example of a server reaching a very high load average:

server~:# uptime
13:46:59 up 2 days, 18:54, 1 user, load average: 58.09, 59.08, 60.05
load average: 0.09, 0.08, 0.08

Sometimes a high load average is not a problem, as the server might have very powerful hardware. High load numbers are not always an indicator of a serious problem. A machine with 16 dual core CPUs (2.18 GHz) and 16GB of RAM could probably work normally with a high load average like the one in the example. However, most servers are not that powerful, and such a high load average makes the machine hardly able to do its routine job.

In my specific case, one of our Debian Linux servers periodically reaches very high load levels. When this happens the Apache webserver is often incapable of serving its incoming requests and starts lagging for clients. The only workaround is to stop the Apache server for a couple of seconds (10 or 20 seconds) and then start it again once the load average has dropped below 3.

If this temporary fix is not applied in time, the server load keeps growing until all the server services (ssh, ftp … whatever) stop responding normally to requests and the server completely hangs …

Often these server overloads occur at night, when I'm not logged in on the server, and one such unexpected overload makes the server unreachable for hours.
To get around the sudden periodic load average increase, I've written a tiny shell script to monitor the server load average and initiate an Apache server stop and start with a few seconds delay in between.

#!/bin/sh
# script to check server for extremely high load and restart Apache if the condition is matched
# print the integer part of the 1 minute load average
get_load() {
    sed 's/\./ /' /proc/loadavg | awk '{print $1}'
}
check=$(get_load)
# define max load average when script is triggered
max_load='25'
# log file
high_load_log='/var/log/apache_high_load_restart.log'
# location of index.php to overwrite with temporary message
index_php_loc='/home/site/www/index.php'
# location of Apache init script
apache_init='/etc/init.d/apache2'
# message shown to visitors while the load is high
site_maintenance_msg="Site Maintenance in progress - We will be back online in a minute"
if [ "$check" -gt "$max_load" ]; then
    # first measure: temporarily replace the index page with a maintenance message
    cp -rpf "$index_php_loc" "$index_php_loc.bak_ap"
    echo "$site_maintenance_msg" > "$index_php_loc"
    sleep 15
    # re-read the load, otherwise the second check would only repeat the first result
    check=$(get_load)
    if [ "$check" -gt "$max_load" ]; then
        $apache_init stop
        sleep 5
        $apache_init start
        echo "$(date) : Apache Restart due to excessive load | $check |" >> "$high_load_log"
    fi
    # put the original index page back, whether or not Apache had to be restarted
    cp -rpf "$index_php_loc.bak_ap" "$index_php_loc"
fi

The idea of the script is partially based on a forum thread, Auto Restart Apache on High Load: http://www.webhostingtalk.com/showthread.php?t=971304
Here is a link to my restart_apache_on_high_load.sh script

The script is written so that it makes two "if" checks, to be sure the high load average is constant and not just a short load average jump of a few seconds. Once the first if is matched, the script first tries to reduce the server load by overwriting the index.php of the website with a page stating that the server is undergoing maintenance.
Temporarily replacing the index page often reduces the load within 10 seconds or so, so the second if is often not needed at all. Sometimes, however, this first measure cannot decrease the load enough and the server load stays too high. Then the second if comes into play: Apache is stopped completely via the Apache init script, and after a few seconds of delay the Apache server is launched again.
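
The threshold comparison at the heart of both checks can be tried out in isolation. Below is a small sketch; the function name load_exceeds and the sample lines are my own invention for the illustration and are not part of the script. It takes a line in /proc/loadavg format plus a threshold, so it can be tested without actually overloading a machine.

```shell
# succeeds (exit status 0) when the 1 minute load average in the given
# /proc/loadavg style line is above the threshold (hypothetical helper)
load_exceeds() {
    loadavg_line=$1
    threshold=$2
    # the first field is the 1 minute average
    one_min=${loadavg_line%% *}
    # drop the fraction, since the shell test only compares integers
    [ "${one_min%%.*}" -gt "$threshold" ]
}

if load_exceeds "58.09 59.08 60.05 3/250 12345" 25; then
    echo "load too high"
fi
if ! load_exceeds "0.09 0.08 0.08 1/250 12345" 25; then
    echo "load is fine"
fi
```

On a live Linux system the line would come from /proc/loadavg, for example load_exceeds "$(cat /proc/loadavg)" 25.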

The script also logs the load average encountered while the server was overloaded and Apache was restarted, so later I can check at what time the server overload occurred.
To make the script run periodically, I've scheduled it to launch every 5 minutes as a cron job with the following cron:

# restart Apache if load is higher than 25
*/5 * * * * /usr/sbin/restart_apache_on_high_load.sh >/dev/null 2>&1

I also have another system, running FreeBSD 7_2, which has the same server overload problems as the Linux host.
Copying the auto restart apache on high load script to FreeBSD didn't work out of the box, so I rewrote a little chunk of the script to make it run on the FreeBSD host. Hence, if you would like to auto restart Apache or any other service on a FreeBSD server, get my script /usr/sbin/restart_apache_on_high_load_freebsd.sh and set it in cron on your BSD.

This script is just a temporary workaround. It's obvious that the high overloads will become more frequent with time and that we will need to buy new server hardware to solve the issue permanently. Until this happens, though, the script does a great job 🙂

I'm aware there is also an alternative way to auto restart the Apache webserver on high server load, using the monit utility for monitoring services on a Unix system. However, as I didn't want to bother running extra services in the background, I decided to rather use the script presented above.

It is interesting to know that an Apache module, mod_overload, exists, which can be used for checking the load average. With this module, once the load average is over a certain number, Apache can stop serving the current requests in its preforked processes. I've never tested it myself, so I don't know how usable it is. As of the time of writing it is at an early stage, version 0.2.2.
If someone has tried it and is happy with it on busy hosting servers, please share with me whether it is stable enough.

Using perl and sed to substitute strings in multiple files on Linux and BSD

Friday, August 26th, 2011

On many occasions when administering Linux, BSD, SunOS or any other *nix, there is a need to substitute a certain string inside a file, or a group of files, with another one.

The task is not too complex, and many of the senior sysadmins out there have certainly already faced this requirement and probably have a good idea of how to do file substitution with perl and sed. However, I'm quite sure there are dozens of system administrators out there who don't know how, and haven't yet faced a situation where a substitution has to be made from a command shell or via a scripting language.

This article targets exactly these system administrators who are not 100% sys op Gurus 😉

1. Substitute text strings inside files on Linux and BSD with perl

The Perl programming language was originally created to do a lot of text manipulation, and most of the Linux / Unix based hosts today have a working copy of perl installed. Therefore using perl as a means to substitute one string in a file with another is maybe the best way to complete the task.
Another good thing about perl is that text processing with it is said to be, in most cases, a bit faster than with sed.
However, this still depends on the string to be substituted. I haven't done benchmark tests, so I can't say with certainty that perl is always quicker. It is only my guess that it will be.

Now enough talk. Here is a very simple way to substitute a recurring text string inside a file with another chosen one:

debian:~# perl -pi -e 's/foo/bar/g' file1 file2

This will substitute the string foo with bar everywhere it's matched in file1 and file2.

However, the above code is a bit "dangerous", as it does not keep a backup copy of the original files in which the string is substituted.
Therefore the above command should only be used when one is 100% sure about the string changes to be made.

Hence a better idea when conducting the text substitution is to also keep a backup of the original file under, let's say, a .bak extension. To achieve that I use perl as follows:

freebsd# perl -i.bak -p -e 's/syzdarma/magdanoz/g;' file1 file2

This command creates copies of the original files file1 and file2 under the names file1.bak and file2.bak. In file1 and file2 themselves, all occurrences of the string syzdarma get substituted with magdanoz. The /g option means substitute globally, that is every occurrence on a line and not only the first one.

2. Substitute string in all files inside directory using perl on Linux and BSD

Every now and then there is a need to manipulate large numbers of files. Right now I can't remember a good scenario where I had to change all occurrences of a string to another one in all files located inside a directory, but I've done this on a number of occasions.

A good way to do a mass file string substitution on Linux and BSD hosts equipped with a bash shell is via the commands:

debian:/root/textfiles:# for i in *.txt; do perl -i.bak -p -e 's/old_string/new_string/g;' "$i"; done

Here the text files have the default text file extension, .txt.

The above bash loop goes through each of the .txt files located in /root/textfiles and substitutes old_string with new_string everywhere (globally).

An alternative to the above example, which replaces a recurring text string in all files in multiple directories, is possible using a combination of the shell commands grep, perl, sort, uniq and xargs.
Let's say that one wants to search the root directory and all its descendant directories for files containing a custom string and substitute it with another one. This can be done with the command:

debian:~# grep -R --files-with-matches 'old_string' / | sort | uniq | xargs perl -pi~ -e 's/old_string/new_string/g'

This command will look up the string old_string in all files under the / (root) directory and, in case of an occurrence, substitute it with new_string (the idea for this command was borrowed from http://linuxadmin.org, so thx.).

Using a combination of 5 commands, however, is not very wise in terms of efficiency.

Therefore, to save some system resources, it's better to take advantage of the find command in combination with xargs. Here is how:

debian:~# find / | xargs grep 'old_string' -sl |uniq | xargs perl -pi~ -e 's/old_string/new_string/g'

Once again, the find command example does exactly the same as the substitution method with grep -R …
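
Both pipelines split the file list on whitespace, so a file name containing a space gets cut in two. A hedged sketch of one way around this, assuming a grep and xargs that support --null and -0 (the GNU and BSD versions do): file names are passed separated by NUL bytes instead. The scratch directory stands in for / and all names in it are made up.

```shell
# scratch tree with a space in both a directory and a file name (hypothetical names)
workdir=$(mktemp -d)
mkdir -p "$workdir/sub dir"
printf 'old_string here\n' > "$workdir/sub dir/with space.txt"
printf 'nothing to change\n' > "$workdir/plain.txt"

# --null ends each file name with a NUL byte and xargs -0 splits on it,
# so names containing spaces arrive at perl intact
grep -rl --null 'old_string' "$workdir" | xargs -0 perl -pi~ -e 's/old_string/new_string/g'
```

The sketch assumes at least one file matches. grep -l already prints each matching file only once, which is why no sort or uniq step is needed here.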

As enough has been said about substituting text strings inside files using perl, I will now explain how text strings can be substituted using sed.

The main reason why sed could be a better choice in some cases is that not all Unices are equipped with a perl interpreter by default. In general, the number of servers with sed installed is surely higher than the number with a perl interpreter.

3. Substitute text strings inside files on Linux and BSD with sed stream editor

On many occasions when a website is hosted, one needs to quickly change a string inside all files located in a directory, for example to resolve issues with static URLs hardcoded in the html.
To achieve this task, here is some code using two little bash loops in conjunction with the sed and mv commands:

debian:/var/www/website# for i in *; do sed -e "s#index.htm#http://www.webdomain.com/#g" "$i" > "$i.new"; done
debian:/var/www/website# for i in *.new; do mv "$i" "${i%.new}"; done

The above command sed -e "s#index.htm#http://www.webdomain.com/#g" instructs sed to substitute all appearances of the text string index.htm with the new text string http://www.webdomain.com/

The first bash for loop writes the files with the substituted string to file1.new, file2.new, file3.new etc.
The second for loop uses mv to overwrite the original input files file1, file2, file3 etc. with the newly created ones file1.new, file2.new, file3.new

There is a much shorter way to complete the same text substitution task, a simple one liner using only sed and the shell's * wildcard expansion. Here is how:

debian:/var/www/website# sed -i 's/old_string/new_string/g' *

The above command will change old_string to new_string inside all files in the directory /var/www/website

When the change has to be made in a modest number of files, this method is simple and efficient. However, when the substitution has to be done in, let's say, 5000+ files, the simplistic version above may not work: the shell expands * into one long list of file names, and if that list exceeds the system's limit on argument length, an Argument list too long error will prevent sed -i 's/old_string/new_string/g' from completing its task.
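
One way around the Argument list too long error, sketched here as an assumption of mine rather than something I have run on a real 5000+ file site, is to let find hand the file names to sed. With -exec … {} + find groups the names into batches that fit the limit. The scratch directory, the file names and the count of 200 are arbitrary choices for the demonstration.

```shell
workdir=$(mktemp -d)
# create a pile of sample files (the count is arbitrary)
i=1
while [ "$i" -le 200 ]; do
    printf 'old_string %s\n' "$i" > "$workdir/page$i.html"
    i=$((i + 1))
done

# find passes the names to sed in batches that fit the argument limit;
# -i.bak keeps a backup of each file and works with GNU and BSD sed alike
find "$workdir" -maxdepth 1 -type f -name '*.html' -exec sed -i.bak 's/old_string/new_string/g' {} +

changed=$(grep -l 'new_string' "$workdir"/*.html | wc -l | tr -d ' ')
echo "files changed: $changed"
```

The -maxdepth 1 keeps find from descending into subdirectories, which mirrors what the plain * does.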

The above two liner with the for loops should also work without problems on FreeBSD and the rest of the BSD derivatives, though I have not tested it yet, hence any feedback from FreeBSD guys is most welcome.

Consider that in order for the for loop commands to work on FreeBSD or NetBSD, they have to be run under a bash shell.
That's all folks, thanks to the Lord for letting me write this nice article. I hope it gives some insight into how text replacement in multiple files works on Unix.
Cheers 😉