Need help sorting out how to use 23? Or have you discovered one of those nasty bugs?

Issue with non-Latin characters in metadata

Pierre-Jean Alet   February 20, 2011, 09:47 PM

Hi,

There are non-Latin characters in the captions of my pictures, stored as metadata. Unfortunately, they don't appear correctly when I upload the pictures.
I have tried to store the caption in the XMP segment, the IPTC segment (in Caption-Abstract) and the EXIF segment (in Description). In all three cases, the caption is detected when I upload the picture, but non-Latin characters never show correctly. However, there isn't any problem when I read the metadata with exiftool on my computer (which runs Mac OS 10.5.8).
I am aware of other threads where the metadata were incorrectly written by another program. In my case, they were written with a script using exiftool, and the encoding is specified as UTF8 (in IPTC:CodedCharacterSet).
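For reference, a write of this kind could be scripted roughly as below. This is only a sketch: the helper name `build_caption_args` and the file name are made up for illustration, and it assumes exiftool is installed and on the PATH. The two tag assignments (`IPTC:CodedCharacterSet` and `IPTC:Caption-Abstract`) are the ones named above.

```python
import subprocess
from typing import List


def build_caption_args(image_path: str, caption: str) -> List[str]:
    """Build an exiftool command line that writes a UTF-8 IPTC caption.

    Hypothetical helper: it only assembles the arguments, it does not run them.
    """
    return [
        "exiftool",
        # Declare that IPTC strings in this file are UTF-8 encoded.
        "-IPTC:CodedCharacterSet=UTF8",
        # Write the caption itself.
        f"-IPTC:Caption-Abstract={caption}",
        image_path,
    ]


def write_caption(image_path: str, caption: str) -> None:
    """Run exiftool with the arguments above (requires exiftool on the PATH)."""
    subprocess.run(build_caption_args(image_path, caption), check=True)
```

Passing the arguments as a list avoids shell quoting problems with non-Latin characters, for example `write_caption("photo.jpg", "Été à Zürich")`.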

Thanks in advance for your help.

 
boegh   February 21, 2011, 09:36 AM

You should be aware of the following from the exiftool FAQ:

Most textual information in EXIF is stored in ASCII format, and ExifTool does not convert these tags. However it is not uncommon for applications to write UTF‑8 or other encodings where ASCII is expected, and ExifTool will quite happily read/write any encoding without conversion. For a few EXIF tags (UserComment, GPSProcessingMethod and GPSAreaInformation) the stored text may be encoded either in ASCII, Unicode (UCS-2) or JIS.

If your editing application writes UTF-8 data in fields where ASCII is expected, exiftool will *not* convert it to ASCII as required by the EXIF 2.3 standard. As such, using exiftool is no guarantee of correctly formatted EXIF data.
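To make the symptom concrete, here is a small sketch of what happens when UTF-8 bytes are read with the wrong encoding, together with one tolerant way of decoding them. The function name `decode_caption` is hypothetical, and the Latin-1 fallback is an assumption for illustration, not something exiftool or 23 is known to do.

```python
def decode_caption(raw: bytes) -> str:
    """Decode caption bytes as UTF-8, falling back to Latin-1 if that fails.

    The fallback is an illustrative assumption; Latin-1 never raises because
    every byte value maps to a character.
    """
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return raw.decode("latin-1")


caption = "é"
raw = caption.encode("utf-8")      # two bytes, neither of them ASCII
garbled = raw.decode("latin-1")    # wrong decoder: one character becomes two

print(raw.isascii())               # False: the field no longer holds pure ASCII
print(garbled == caption)          # False: this is the garbling seen after upload
print(decode_caption(raw) == caption)  # True
```

A reader that assumes ASCII or Latin-1 for such a field will show the garbled form even though the bytes themselves are valid UTF-8.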

Also keep in mind that the default preference for reading metadata is usually EXIF, then IPTC, then XMP (I have no idea in which order 23hq handles these), so incorrectly encoded EXIF data will usually be shown even when correctly encoded IPTC or XMP data is present.
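A minimal sketch of that kind of priority lookup follows. Everything in it is an assumption for illustration: the function `pick_caption`, the default order, and the tag keys (written in the `Group:Tag` style that exiftool uses in grouped output). It is not how 23hq is known to read captions.

```python
from typing import Mapping, Optional, Sequence

# Assumed reading order: EXIF first, then IPTC, then XMP.
DEFAULT_ORDER = (
    "EXIF:ImageDescription",
    "IPTC:Caption-Abstract",
    "XMP:Description",
)


def pick_caption(
    tags: Mapping[str, str],
    order: Sequence[str] = DEFAULT_ORDER,
) -> Optional[str]:
    """Return the first non-empty caption found, following the given order."""
    for name in order:
        value = tags.get(name)
        if value:
            return value
    return None


tags = {
    "EXIF:ImageDescription": "CafÃ©",  # badly decoded copy
    "XMP:Description": "Café",         # correct copy
}
print(pick_caption(tags))                                     # the EXIF copy wins
print(pick_caption(tags, order=list(reversed(DEFAULT_ORDER))))  # the XMP copy wins
```

With the default order the badly decoded EXIF value hides the correct XMP value, which is the situation described above.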

 
Steffen Fagerström Christensen Team 23   February 22, 2011, 05:18 PM

Hey guys -- and thank you for the reply, Henrik.

We are actually using exiftool on our end to extract the data as well, so it might simply be a matter of fixing some encoding handling in our code to solve this problem. Pierre-Jean, can you send an original file exhibiting the problem to my email at steffen@23hq.com?

Thanks,
Steffen

 
Pierre-Jean Alet   February 22, 2011, 10:54 PM

Steffen, Henrik,

Thanks a lot for your replies. Steffen, I've just sent you an e-mail.

By the way, I find the picture metadata quite confusing, with lots of duplicates between the segments. Do you have any recommendations about the fields to use, in particular to ensure future compatibility?

Cheers,


Pierre-Jean

 
Steffen Fagerström Christensen Team 23   February 27, 2011, 01:42 PM

Pierre-Jean,

EXIF and UTF-8 are apparently a pretty tricky combination -- but we've made a few changes on our end that should mean that non-Latin characters from EXIF are displayed correctly after upload.

 
Pierre-Jean Alet   February 28, 2011, 01:00 PM

Steffen,

That works perfectly now with my pictures (caption stored under IPTC:Caption-Abstract, with IPTC:CodedCharacterSet set to UTF8).
Thanks a lot for your quick reaction!

Cheers,

Pierre-Jean

 



