This module CAN be handy. I use it to color a diff between versions of HTML snippets in a CMS. Unfortunately, it uses a string LCS approach (see Algorithm::Diff) that is clumsy and sometimes buggy when it comes to changes in HTML tag attributes:
html_word_diff('<p class="foo">Yo!</p>', '<p class="bar">Boo!</p>');
completely ignores the class attribute change:
["u", "<p class=\"foo\">", "<p class=\"bar\">"],
["c", "Yo!", "Boo!"],
["u", "</p>", "</p>"],
But all in all it's better than nothing if those "changesets" need not be ultra-reliable. If you feed it simple, valid, pretty-printed XHTML, the results are quite good and very easy to process: I simply put background-colored spans around the changes and join it all together. Sometimes I have seen html_word_diff break inside tags, which makes my simple approach render invalid XHTML. Yikes!
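A minimal sketch of that span-wrapping step, for illustration only. It runs on the chunk list shown above rather than calling the module. The helper names span and render_diff, the colors, and the meaning of the '+' and '-' flags are assumptions, not something taken from HTML::Diff itself:

```perl
use strict;
use warnings;

# Chunks in the shape shown in the review: [flag, old_text, new_text].
# Assumed flag meanings: 'u' unchanged, 'c' changed, '+' added, '-' removed.
my @chunks = (
    [ 'u', '<p class="foo">', '<p class="bar">' ],
    [ 'c', 'Yo!',             'Boo!' ],
    [ 'u', '</p>',            '</p>' ],
);

# Hypothetical helper: wrap a piece of text in a background-colored span.
sub span {
    my ($color, $text) = @_;
    return '' unless length $text;
    return qq{<span style="background-color: $color">$text</span>};
}

# Hypothetical helper: render the new version with changes highlighted.
sub render_diff {
    my @out;
    for my $chunk (@_) {
        my ($flag, $old, $new) = @$chunk;
        if    ($flag eq 'u') { push @out, $new }                 # passed through as-is
        elsif ($flag eq '+') { push @out, span('#cfc', $new) }   # added text in green
        elsif ($flag eq '-') { push @out, span('#fcc', $old) }   # removed text in red
        elsif ($flag eq 'c') { push @out, span('#fcc', $old), span('#cfc', $new) }
    }
    return join '', @out;
}

print render_diff(@chunks), "\n";
```

Note that the 'u' chunk is passed through untouched, so the class change from "foo" to "bar" shows up in the output without any highlighting. If a chunk boundary ever falls inside a tag, the span ends up inside that tag, which is how the invalid XHTML comes about.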
To be honest: I later tried XML::Diff, which is perfect for the task in terms of correctness, but it took far longer to get the same thing working. HTML::Diff worked "quite well but imperfectly" within a couple of minutes.
Just one sentence: it uses regular expressions for parsing HTML. It cannot be reliable.