Hi Mark,
XMP supports special characters beyond the standard ASCII using
UTF-8 encoding. The problem is on the IPTC side: IPTC character
encoding is fantastically obscure, and not well implemented by
other software. Even Photoshop does not adhere
to the IPTC specification, and will write Latin1 characters ad-hoc
in IPTC without properly setting the CodedCharacterSet
tag.
For this reason it is very difficult to properly handle special characters
in IPTC.
Also, I don't have a very good test set of IPTC containing special
characters from other applications, so it is difficult for me to know
what the best way to handle this is. Can you tell me what encoding
is used in your IPTC samples that contain special characters, and
what the CodedCharacterSet tag is set to?
According to
this source,
it may be sufficient in most cases to just assume Latin-1 encoding if not
specified. If this is true, I could add an option which would force
ExifTool to assume Latin-1 encoding and convert appropriately.
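To make the idea concrete, here is a minimal Python sketch of such a fallback. It is not ExifTool code: the function name decode_iptc_string is hypothetical, and it assumes the only declared character set worth recognizing is UTF-8, which IPTC signals by putting the escape sequence ESC % G in CodedCharacterSet. Anything else, including a missing tag, is treated as Latin-1.

```python
from typing import Optional

# ISO 2022 escape sequence (ESC % G) that declares UTF-8 in IPTC CodedCharacterSet.
UTF8_ESCAPE = b"\x1b%G"


def decode_iptc_string(raw: bytes, coded_character_set: Optional[bytes] = None) -> str:
    """Decode a raw IPTC string value (hypothetical helper, not an ExifTool API).

    If CodedCharacterSet declares UTF-8, decode as UTF-8. Otherwise assume
    Latin-1, which is the fallback discussed above.
    """
    if coded_character_set == UTF8_ESCAPE:
        # Replace malformed bytes rather than failing on a damaged value.
        return raw.decode("utf-8", errors="replace")
    # Assumption: an unspecified or unrecognized character set means Latin-1.
    # Latin-1 maps every byte to a character, so this decode cannot fail.
    return raw.decode("latin-1")


if __name__ == "__main__":
    # Latin-1 bytes written without CodedCharacterSet, as Photoshop does.
    print(decode_iptc_string(b"Z\xfcrich"))
    # UTF-8 bytes with the tag set correctly.
    print(decode_iptc_string("Zürich".encode("utf-8"), UTF8_ESCAPE))
```

The weak point of this approach is a file that contains UTF-8 bytes but no CodedCharacterSet tag: the Latin-1 fallback would decode it without error but produce the wrong characters, which is why this would have to be an option and not the default behaviour.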
If anyone has any ideas on this matter, I'd love to hear them.
- Phil
