Although it's quite old, the module installed without problems.
The default behaviour seems quite zany. The default setup converts comments in the HTML into escaped entities, like this:
& lt;!-- comment --& gt;
If I added this to the object creation:
my $htf = HTML::TagFilter->new (
    strip_comments => 1,
);
then it would correctly strip out the comments, so it must be recognising them as comments. Yet by default it performs this zany conversion, which makes text that was meant to be hidden in HTML comments appear visibly on the page. I can't find anywhere in the documentation that explains the rationale for this, and it just seems like a bug to me.
The doctype declaration seems to get the same treatment:
& lt; !DOCTYPE HTML& gt;
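For anyone who wants to check the comment handling themselves, here is a minimal sketch that runs the same input through a default filter and through one created with strip_comments. It assumes the module's filter method is the way to process a string. The sample HTML and the has_escaped_comment helper are my own illustration and not part of HTML::TagFilter. The module is loaded at run time so the script still runs where it isn't installed.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::TagFilter): returns 1 if the
# text contains a comment opener that was escaped into visible text.
sub has_escaped_comment {
    my ($text) = @_;
    return $text =~ /&lt;!--/ ? 1 : 0;
}

# Sample input, chosen only for illustration.
my $html = "<p>Hello</p><!-- comment -->";

# Load the module at run time so a missing install is reported, not fatal.
if (eval { require HTML::TagFilter; 1 }) {
    my $default  = HTML::TagFilter->new;
    my $stripper = HTML::TagFilter->new (strip_comments => 1);

    for my $case (['default', $default], ['strip_comments', $stripper]) {
        my ($label, $filter) = @$case;
        # Assumption: filter () takes an HTML string and returns the result.
        my $out = $filter->filter ($html);
        printf "%-15s %s (escaped comment: %s)\n", $label, $out,
            has_escaped_comment ($out) ? 'yes' : 'no';
    }
}
else {
    print "HTML::TagFilter is not installed; skipping.\n";
}
```

If the behaviour is as I describe above, the default line should report an escaped comment and the strip_comments line should not.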
This module probably didn't work correctly at the time of its most recent release, in 2005, and it cannot be recommended in 2017. I suggest trying out HTML::Restrict, HTML::Scrubber, or HTML::Strip instead.
(There is a bug in cpan ratings where it is failing to convert & amp; into & correctly, so excuse me if this text becomes incomprehensible after repeated edits. To work around the cpan ratings issue I have used a space in the above.)
For a list of similar modules and links to other reviews, please see my page at www.lemoda.net/perl/html-cleanup-modu...