When using a subroutine callback for an instance of HTML::Parser, the offset and offset_end attributes come out about 1000 bytes too high. For an 800-byte "myfile.foo", the following code reports "start" tag offsets of around 1840 and 1853, which are clearly past the end of the file:
use HTML::Parser;
my $Hparser = HTML::Parser->new(api_version => 3);
$Hparser->handler(
    start => sub {
        my ($tag, $start, $end) = @_;
        printf("%s starts at %d and ends at %d\n", $tag, $start, $end);
    },
    "tagname,offset,offset_end");
$Hparser->parse_file("myfile.foo");
I got the same results by reading the whole file into a single variable and parsing that. The file is ordinary one-byte-per-character ASCII with 67 lines (CR/LF line endings).
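For anyone who wants to reproduce the single-variable variant, here is a rough sketch of one way to do it and to check what the reported offsets actually point at. The helper name check_offsets, the extra "text" argspec field, and the binmode call are my assumptions for illustration, not part of the code above; binmode is there only so that the CR/LF bytes are counted as they sit on disk.

```perl
use strict;
use warnings;
use HTML::Parser;

# Hypothetical helper: parse $doc and return one record per start tag,
# noting whether the reported offsets really delimit the tag's text.
sub check_offsets {
    my ($doc) = @_;
    my @report;
    my $parser = HTML::Parser->new(api_version => 3);
    $parser->handler(start => sub {
        my ($tag, $start, $end, $text) = @_;
        # Guard against offsets that point past the end of the document.
        my $slice = $end <= length($doc)
            ? substr($doc, $start, $end - $start) : undef;
        my $ok = (defined $slice && $slice eq $text) ? 1 : 0;
        push @report, [$tag, $start, $end, $ok];
    }, "tagname,offset,offset_end,text");
    $parser->parse($doc);
    $parser->eof;
    return @report;
}

# Slurp the whole file; binmode is an assumption to keep CR/LF bytes as-is.
if (open(my $fh, '<', 'myfile.foo')) {
    binmode($fh);
    my $doc = do { local $/; <$fh> };
    close($fh);
    printf("%s: %d-%d %s\n", @$_[0..2], $_->[3] ? "matches" : "MISMATCH")
        for check_offsets($doc);
}
```

If the offsets are sane, every line should say "matches", since the slice of the document between offset and offset_end should equal the tag's source text. A "MISMATCH" on every tag would at least confirm that the offsets do not line up with the document the parser was given.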
Has anybody seen this problem on the current release of W2k with ActiveState ActivePerl 5.8 and been able to resolve it?
Thanks for taking a look,
Pat Ampulla