Image-ExifTool - Re: Warning: Malformed UTF-8 character(s)

Posted on Tue May 15 20:43:00 2007 by exiftool in response to 5146 (See the whole thread of 10)
I'm glad it makes a bit more sense now.

When writing information, ExifTool uses the value of CodedCharacterSet to determine how to encode the text. If CodedCharacterSet is being written at the same time as text, the new character set is used. If no CodedCharacterSet exists and none is written, then Latin1 is assumed.
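The rule above can be sketched roughly as follows. This is a Python illustration only, not ExifTool's actual (Perl) code: the function names, the `CODECS` table, and the restriction to just the "UTF8" and "Latin1" labels are assumptions made for the example.

```python
from typing import Optional

# Hypothetical mapping from a character set label to a Python codec name.
# Only the two cases discussed in this thread are covered.
CODECS = {"UTF8": "utf-8", "Latin1": "latin-1"}


def effective_charset(existing: Optional[str], written: Optional[str]) -> str:
    """Pick the character set used to encode text that is being written.

    existing: CodedCharacterSet already stored in the file, or None.
    written:  CodedCharacterSet being written in the same operation, or None.
    """
    if written is not None:
        return written      # a newly written character set takes precedence
    if existing is not None:
        return existing     # otherwise use whatever the file already declares
    return "Latin1"         # nothing declared and nothing written


def encode_iptc_text(text: str,
                     existing: Optional[str] = None,
                     written: Optional[str] = None) -> bytes:
    """Encode text according to the effective character set (sketch only)."""
    return text.encode(CODECS[effective_charset(existing, written)])
```

With these assumptions, "é" is stored as the single byte 0xE9 when no CodedCharacterSet exists, and as the two bytes 0xC3 0xA9 when CodedCharacterSet is (or is being set to) UTF8.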

The special character handling in IPTC is a real mess. The way ExifTool originally handled it (by never translating) was simplest, but it seems that most other applications assume Latin1 characters (contrary to the actual IPTC specification), so ExifTool was displaying special characters written by these applications incorrectly. This is the reason for the change.

If enough people have problems with this, I am open to changing it back again.
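To illustrate the display problem described above, here is a small Python sketch (again not ExifTool code; the function names and the sample string are made up for the example). It assumes an application has stored text as Latin1 bytes and that the result is shown in a UTF-8 environment.

```python
def display_raw(stored: bytes) -> str:
    """'Never translate': treat the stored bytes as if they were already UTF-8.

    Bytes that are not valid UTF-8 are replaced with U+FFFD here, standing in
    for the garbage or warnings a UTF-8 display would produce.
    """
    return stored.decode("utf-8", errors="replace")


def display_assuming_latin1(stored: bytes) -> str:
    """Translate on the assumption that the writer used Latin1."""
    return stored.decode("latin-1")


# What an application that assumes Latin1 would store for "Café".
stored = "Caf\u00e9".encode("latin-1")   # the bytes 43 61 66 E9
```

Here the lone byte 0xE9 is not a valid UTF-8 sequence, so `display_raw(stored)` cannot show the "é", while `display_assuming_latin1(stored)` recovers the original string.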

It is a pity that so few applications support UTF-8 in IPTC, because it is the best solution. The original IPTC specification used ISO 2022, which is a real can of worms and hence isn't well supported either. UTF-8 support was added in a later revision of the IPTC specification (I believe).

- Phil