HTML-Lint reviews


HTML-Lint (2.06) **

I wanted to check some cruddy old HTML pages for errors, so I started looking on CPAN. HTML::Lint was the second thing I came upon, after one other thing which looked very difficult to install.

It did a few useful things, like catching closing tags with no matching opening tag, or flagging img tags with no height, width, or alt text. But this module has a big problem if your web page contains any UTF-8 encoded Unicode characters. If you use its "parse_file" method, not only does it insist that you use HTML entities:

Invalid character \xC2 should be written as &Acirc;
Invalid character \xA9 should be written as &copy;

but, even worse, it takes no notice of the file's encoding and tells you to use an entity for each byte, which is wrong and would result in a broken web page. If you read the text in yourself as UTF-8 and pass it to the module via its "parse" method, things deteriorate further, since it has no way of coping with such input. When I tried "parse" on such a string, I got streams of errors like this:

Use of uninitialized value $val in substitution (s///) at /home/ben/software/install/lib/perl5/site_perl/5.10.0/HTML/Lint/Error.pm line 112.
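
As a rough illustration of why one entity per byte is wrong: the two "Invalid character" messages quoted earlier are consistent with a single copyright sign, which UTF-8 stores as the two bytes C2 A9. The sketch below assumes a POSIX shell with od and tr available; hex_bytes is a made-up helper name, not part of HTML::Lint.

```shell
# Print the bytes read from stdin as hex pairs, to show how many bytes
# a single character occupies. hex_bytes is a hypothetical helper name.
hex_bytes() {
    od -An -tx1 | tr -d ' \n'
    echo
}

# The copyright sign encoded as UTF-8, written with octal escapes
# because not every shell's printf understands \x escapes.
printf '\302\251' | hex_bytes
```

Replacing each of those two bytes with its own entity would render as two characters instead of one copyright sign.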

I tried fiddling with the three error switches provided, but it turned out that switching off the entity part also switches off the other parts which were useful to me. I guess one could just send the output of this module through 'grep -v "Invalid character"' to remove these bogus errors.
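
The grep workaround could look something like the sketch below. The sample lines are modeled on the messages quoted in this review, plus one made-up placeholder line standing in for a warning worth keeping; filter_lint is a hypothetical helper name.

```shell
# Drop the per-byte entity complaints and pass every other line through.
# filter_lint is a hypothetical helper; it reads a lint report on stdin.
filter_lint() {
    grep -v "Invalid character"
}

# Sample report: two entity complaints and one placeholder warning.
printf '%s\n' \
    'Invalid character \xC2 should be written as &Acirc;' \
    'Invalid character \xA9 should be written as &copy;' \
    'placeholder: some other warning worth keeping' | filter_lint
```

This only hides the messages. It does nothing about the underlying encoding handling, and it would also hide any entity warning that happened to be legitimate.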

Looking at the issue tracker for the module

code.google.com/p/html-lint/issues/list

it seems like the above problems have been reported already.

Ben Bullock - 2010-03-28T21:23:12

4 out of 5 found this review helpful.