Easy-to-use module, even though the interface and documentation are somewhat unusual by today's standards.
Since it builds an actual tree, the module is on the slow side (1000 lines or 256k of HTML from bit.ly/html_tree_benchmark_html on a 2GHz CPU), so removing any unneeded parts before parsing will help. For example, when parsing a large HTML table, I removed all the attributes of the TD elements, all formatting <span>s, and all <NOBR>s around numbers, and that sped up the parsing 2x.
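A minimal sketch of that kind of pre-parse cleanup, using only core Perl regexes on the raw HTML string. The helper name strip_noise and the sample markup are made up for illustration and are not part of HTML::TreeBuilder. The regexes assume attribute values never contain a literal ">", which real-world HTML does not guarantee.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::TreeBuilder): removes markup that
# is not needed for data extraction, before the string is handed to a parser.
# Assumption: no attribute value contains a literal '>'.
sub strip_noise {
    my ($html) = @_;
    $html =~ s{<td\b[^>]*>}{<td>}gi;    # drop all attributes on opening TD tags
    $html =~ s{</?span\b[^>]*>}{}gi;    # remove SPAN tags, keep their text
    $html =~ s{</?nobr\b[^>]*>}{}gi;    # remove NOBR tags, keep their text
    return $html;
}

# Made-up sample row resembling the table described above.
my $raw = '<table><tr><td class="n" align="right"><nobr>'
        . '<span style="color:red">1,234</span></nobr></td></tr></table>';

my $clean = strip_noise($raw);
print "$clean\n";    # prints <table><tr><td>1,234</td></tr></table>

# The cleaned string could then be parsed as usual, for example:
#   my $tree = HTML::TreeBuilder->new_from_content($clean);
```

Whether this gives a comparable speedup depends on how much of the input is attributes and formatting tags, so it is worth timing on your own documents.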
It also lacks XPath support, but there is a separate HTML::TreeBuilder::XPath module.
This is a really great module for doing HTML processing. I worked with it for a project for a friend, and then as part of WWW::Search, and I found it extremely convenient. It gets the job done quickly.
The documentation is very good. If you find this module useful, you should also look at HTML::TreeBuilder::XPath, which extends the power of HTML::TreeBuilder with XPath expressions.
The documentation layout is unusual, but the documentation itself is quite good. If you need to muck about with HTML, this is the place to start.
This has been my HTML handling module of the month; I dragged it out on four separate occasions to get something done, and it hasn't let me down. It's by far the easiest way I've found of manipulating HTML documents.