Sun Mar 18 19:23:01 EET 2012
How to stop the tidy HTML corrector and validator
from printing error and warning messages
On one of the Debian servers I manage, I noticed that
/var/log/apache2/error.log was filling up with warnings and errors
produced by tidy, the HTML syntax checker and reformatter program.
Messages like the following appeared in the log very frequently:
...
To learn more about HTML Tidy see http://tidy.sourceforge.net
Please fill bug reports and queries using the "tracker" on the Tidy
web site.
Additionally, questions can be sent to html-tidy@w3.org
HTML and CSS specifications are available from
http://www.w3.org/
Lobby your company to join W3C, see
http://www.w3.org/Consortium
line 1 column 1 - Warning: missing <!DOCTYPE> declaration
line 1 column 1 - Warning: plain text isn't allowed in <head> elements
line 1 column 1 - Info: <head> previously mentioned
line 1 column 1 - Warning: inserting implicit <body>
line 1 column 1 - Warning: inserting missing 'title' element
Info: Document content looks like HTML 3.2
4 warnings, 0 errors were found!
...
A quick investigation into where these messages were coming from
turned up a few .php scripts in one of the websites that contain
the string "tidy".
I used the Linux find and grep commands to list every PHP file
containing the "tidy" string, like so:
server:~# find . -iname '*.php' -exec grep -rli 'tidy' '{}' \;
./new_design/modules/index.mod.php
./modules/index.mod.php
./modules/index_1.mod.php
./modules/index1.mod.php
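
The search above can also be wrapped in a small helper. This is only a sketch: find_tidy_php is a hypothetical name, and it assumes a POSIX find that supports "-exec ... +", which hands many files to one grep call instead of starting grep once per file.

```shell
#!/bin/sh
# Sketch only: find_tidy_php is a hypothetical helper, not part of the site.
# Lists PHP files under a directory (default: current one) that mention "tidy".
find_tidy_php() {
    dir="${1:-.}"
    # -l prints only matching file names, -i ignores case;
    # "+" batches the found files into as few grep invocations as possible
    find "$dir" -type f -iname '*.php' -exec grep -li 'tidy' {} +
}
```

Calling find_tidy_php /path/to/site prints one matching file per line, the same kind of list as shown above.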
Opening the files with vim to check how tidy is invoked revealed
tidy calls like:
exec('/usr/bin/tidy -e -ashtml -utf8 '.$tmp_name,$rett);
As you can see, the PHP programmers who wrote this website made a
big tidy mess. Instead of using php5's tidy module, they hard-coded
a call to the external tidy command through PHP's exec().
This is bad practice, since every call spawns a shell and a
separate tidy process from the Apache process serving the request.
I've reported the issue, but I don't know when the external tidy
calls will be rewritten.
Until the external tidy invocations are rewritten to use the PHP
tidy module, I decided to at least silence the tidy warning and
error output.
To remove the warning and error messages I changed:
exec('/usr/bin/tidy -e -ashtml -utf8 '.$tmp_name,$rett);
to:
exec('/usr/bin/tidy --show-warnings no --show-errors 0 -q -e -ashtml -utf8 '.$tmp_name,$rett);
The switches mean the following:
-q - instructs tidy to produce quiet output
-e - show only errors and warnings
--show-warnings no and --show-errors 0 - completely disable warning
and error output (tidy's documentation lists show-errors as an
integer option, so it takes 0 rather than no)
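
Another way to keep the diagnostics out of the log is to discard tidy's stderr at the call site. The sketch below rests on assumptions: check_html and TIDY_BIN are hypothetical names, and it assumes the messages reach error.log through the stderr that the tidy process inherits from Apache. The caller can still use the exit status, which the tidy man page documents as 0 for a clean document, 1 for warnings and 2 for errors.

```shell
#!/bin/sh
# Sketch only: TIDY_BIN and check_html are hypothetical names.
# The tidy path is the one used in the PHP scripts above.
TIDY_BIN="${TIDY_BIN:-/usr/bin/tidy}"

check_html() {
    # -q: quiet, -e: only report problems, no cleaned-up markup on stdout
    # 2>/dev/null discards whatever tidy still writes to stderr
    "$TIDY_BIN" -q -e -ashtml -utf8 "$1" 2>/dev/null
    # the function returns tidy's own exit status
}
```

The same redirection can be appended to the command string passed to PHP's exec(), since exec() runs the command through a shell.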
From then on, tidy no longer logs junk messages in error.log.
Not logging all these useless warnings and errors has a positive
effect on overall server performance, especially when the scripts
running /usr/bin/tidy are called as often as 1000 times per second
or more.
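
To check that the change took effect, one option is to count tidy-style diagnostic lines in the log before and after. This is a rough sketch: count_tidy_noise is a hypothetical helper, and the pattern assumes the messages appear in error.log unprefixed, in the "line N column M - Warning:" form quoted earlier.

```shell
#!/bin/sh
# Sketch only: count_tidy_noise is a hypothetical helper name.
# Counts log lines that look like tidy diagnostics.
count_tidy_noise() {
    log="${1:-/var/log/apache2/error.log}"
    # grep -c prints the number of matching lines
    # (and exits with status 1 when that number is 0)
    grep -cE '^line [0-9]+ column [0-9]+ - (Warning|Error|Info):' "$log"
}
```

If the count stops growing after the exec() calls are changed, the tidy messages are no longer being written.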