This module CAN be handy. I use it to color a diff between versions of HTML snippets in a CMS. Unfortunately, it uses a string-based LCS approach (see Algorithm::Diff) that is awkward and sometimes buggy when it comes to changes in HTML tag attributes:

html_word_diff('<p class="foo">Yo!</p>', '<p class="bar">Boo!</p>');

completely ignores the class attribute change. The opening tag is reported as unchanged ("u") even though its two sides differ:


["u", "<p class=\"foo\">", "<p class=\"bar\">"],
["c", "Yo!", "Boo!"],
["u", "</p>", "</p>"],

But all in all it's better than nothing if those "changesets" need not be ultra-reliable. If you feed it simple, valid and pretty-printed XHTML, the results are quite good and very easy to process: I simply put background-colored spans around changes and join it all together. Sometimes I have seen html_word_diff break inside of tags, which makes my simple approach render invalid XHTML. Yikes!
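
For illustration, here is a minimal sketch of that span-wrapping step. It assumes the chunks arrive as [op, old, new] triples like the ones shown above. The helper names mark and render_diff and the colors are made up for this example, and the sample data is hard-coded, so the sketch does not call HTML::Diff itself. To avoid wrapping markup in a span, it leaves any chunk containing "<" or ">" unwrapped, which is a crude guard and not a real fix for chunks that break inside tags.

```perl
use strict;
use warnings;

# Sample chunks in the [op, old, new] shape shown above.
# Hard-coded so the sketch runs without HTML::Diff installed.
my $chunks = [
    [ 'u', '<p class="foo">', '<p class="bar">' ],
    [ 'c', 'Yo!',             'Boo!' ],
    [ 'u', '</p>',            '</p>' ],
];

# Hypothetical helper: wrap plain text in a background-colored span.
# Chunks containing markup are never wrapped: old-side markup is
# dropped and new-side markup is passed through unchanged.
sub mark {
    my ($text, $color, $is_old) = @_;
    return '' unless defined $text && length $text;
    if ($text =~ /[<>]/) {
        return $is_old ? '' : $text;
    }
    return qq{<span style="background-color: $color">$text</span>};
}

# Hypothetical helper: turn a list of chunks into one HTML string.
sub render_diff {
    my ($chunks) = @_;
    my @out;
    for my $chunk (@$chunks) {
        my ($op, $old, $new) = @$chunk;
        if ($op eq 'u') {
            # Reported as unchanged: emit the new side, so a missed
            # attribute change still shows the current markup.
            push @out, $new;
        }
        elsif ($op eq '+') {
            push @out, mark($new, '#cfc', 0);
        }
        elsif ($op eq '-') {
            push @out, mark($old, '#fcc', 1);
        }
        else {
            # 'c': show the old text followed by the new text.
            push @out, mark($old, '#fcc', 1), mark($new, '#cfc', 0);
        }
    }
    return join '', @out;
}

print render_diff($chunks), "\n";
```

For the sample chunks this yields the new opening tag, "Yo!" in a red-tinted span, "Boo!" in a green-tinted span, and the closing tag. Emitting the new side of "u" chunks keeps the output consistent with the current version, but it still does not highlight the attribute change.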

To be honest: I later tried XML::Diff, which is perfect for the task in terms of correctness, but it took far longer to get the same thing working. HTML::Diff worked "quite well but imperfectly" within a couple of minutes.
