HTML-TableParser reviews

cpanratings
 


HTML-TableParser (0.38) ****

Works well out of the box, extracting the text in table cells.

However, there's no way to extract additional HTML information from TD elements, for instance links or img src attributes. For that, use HTML::TableExtract.

HTML-TableParser (0.38) ****

- doesn't handle "flipped" tables (th's heading rows rather than columns)

- MultiMatch should be the default

HTML-TableParser (0.38) *****

About as close as possible to a perfect Do-What-I-Mean solution to an otherwise painful, complicated process.

HTML-TableParser (0.34) ****

This is a good module for retrieving data from HTML tables.
The following sample program fetches the weather forecast from
tw.weather.yahoo.com/tomorrow.html
and prints the result to stdout.

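use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common;
use HTML::TableParser;

#
# get web page tw.weather.yahoo.com/tomorrow.html
#

my $ua  = LWP::UserAgent->new;
my $res = $ua->request( GET 'http://tw.weather.yahoo.com/tomorrow.html' );
die $res->status_line, "\n" unless $res->is_success;

# print lines beginning with 民國 (the ROC calendar era, presumably the forecast date);
# this byte-level match assumes the script is saved in the same encoding as the page
foreach ( split "\n", $res->content ) {
    print $_, "\n\n" if /^民國/;
}

#
# reference
# search.cpan.org/~djerius/HTML-TablePa...
#

my @reqs = (
    {
        id  => '5.1',    # id for embedded table
        row => \&row,    # function callback
    },
);

# create parser object
my $p = HTML::TableParser->new( \@reqs,
    { Decode => 0, Trim => 0, Chomp => 0 } );

# parse the page body only (as_string would also include the HTTP headers)
$p->parse( $res->content );
$p->eof;

# function callbacks
sub row {
    my ( $id, $line, $cols, $udata ) = @_;

    # one tab-separated line per table row
    print join( "\t", @$cols ), "\n";
}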