PDF-API2 - Garbled Text

Posted on Thu Mar 2 21:01:49 2006 by gnurob
Garbled Text
Alfred, thanks for sharing the PDF::API2 module. It has enabled me to audit more than 10,000 PDF documents without breaking a sweat.

I did run into one problem that I can't lick. Approximately one out of ten PDF documents contains wide characters in its title, which $pdf->info() returns as "junk_text" rather than "(OHS).PDF" as shown in Acrobat 6.0 under Document Properties, Description. This also happens with encrypted documents...
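
One possible explanation, offered as a guess rather than a diagnosis: text strings in a PDF's Info dictionary can be stored either in PDFDocEncoding or as UTF-16BE with a leading byte order mark (bytes FE FF). If the raw bytes come back undecoded, a UTF-16BE title would look like junk. Here is a minimal sketch of decoding such a value by hand. Note that decode_pdf_text_string is a hypothetical helper and not part of PDF::API2, ISO-8859-1 is only an approximation of PDFDocEncoding, and the sketch does nothing about encrypted files.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (NOT part of PDF::API2): turn the raw bytes of a
# PDF text string into a Perl character string.
sub decode_pdf_text_string {
    my ($bytes) = @_;
    return unless defined $bytes;

    # A leading FE FF byte order mark signals UTF-16BE; strip it and decode.
    if ($bytes =~ s/\A\xFE\xFF//) {
        return decode('UTF-16BE', $bytes);
    }

    # Assumption: treat everything else as ISO-8859-1, which only
    # approximates PDFDocEncoding.
    return decode('ISO-8859-1', $bytes);
}

# Build a sample value the way a PDF writer might store the title:
# BOM followed by UTF-16BE bytes.
my $raw   = "\xFE\xFF" . encode('UTF-16BE', '(OHS).PDF');
my $title = decode_pdf_text_string($raw);
print "$title\n";
```

If this guess is right, the same helper could be applied to each value of the hash returned by $pdf->info() before printing or comparing it.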

The CPAN::Forum would not accept the remainder of this post. Here are some example PDFs.

www.gs.gov.nl.ca/ohs/pdf/ann-rep-whsi.pdf (garbled)
www.gs.gov.nl.ca/cca/cr/pdf/coop/coop21-art-dis.pdf (works)
www.gs.gov.nl.ca/misc/data/gazette/wk/2006-01-13.pdf (garbled, encrypted file)

Can PDF::API2 process Unicode characters in the document's meta info?

Thanks, Rob