About a year ago, my daughter scanned nearly 600 old family photographs, adding captions to them as she went. This was fine as a start, but when I was asked for photos of my grandparents’ weddings, there was nothing for it but to look through 600 thumbnails. Unless I wished to invest in photo management software, I clearly needed a way to extract the comments from the images. Then I could put them in a spreadsheet, where I could readily edit them and have separate fields for dates, etc.

My first thought was to ask on the Dyalog Forums whether someone had already done this. Pierre Gilbert suggested that I try netFreeImage (https://old.aplwiki.com/netFreeImage), which uses .NET, but I decided that this was not for me.

Instead, I adopted the direct approach, which has resulted in a deeper understanding of Dyalog and Unicode, although I was sometimes left feeling a bit like the wedding guest in the Rime of the Ancient Mariner – “A sadder and a wiser man, he rose the morrow morn”. I have also added to the number of tools in my toolbox.

The files from which I wished to read the comments fields had been created by scanning photographs using an Epson Perfection V39 scanner; the comments were added in Windows, using File Properties>Details>Comments. Note that the files are all in Motorola (big-endian) format.

A JPEG file contains several segments, with segments and data blocks being delimited by two-byte codes, the first byte of which is always 0xFF, the second being an identifier. After some introductory information (in TIFF format), the layout follows Exif format (see references).
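The segment layout described above can be explored with a short walker. This is only a sketch under stated assumptions: the names Segments and walk are hypothetical (not part of the original workspace), the argument is the leading bytes of the file as integers 0-255 (for example ⎕UCS z after a type-80 read), and every marker before the start of scan (0xDA) is assumed to be followed by a two-byte big-endian length that counts itself but not the marker.

```apl
Segments←{
    ⍝ ⍵: leading bytes of a JPEG file as integers 0-255
    ⍝ Result: two-column matrix of marker id and segment length
    ⎕IO←0
    b←⍵
    walk←{
        (⍵+4)>≢b:⍺              ⍝ ran out of data: stop
        255≠b[⍵]:⍺              ⍝ not positioned on a 0xFF marker: stop
        id←b[⍵+1]
        id=218:⍺⍪id 0           ⍝ 0xDA (start of scan): image data follows
        len←256⊥b[⍵+2 3]        ⍝ big-endian length, includes its own 2 bytes
        (⍺⍪id len)∇ ⍵+2+len     ⍝ move to the next marker
    }
    (0 2⍴0)walk 2               ⍝ skip the 0xFFD8 start-of-image marker
}
```

Calling Segments ⎕UCS z should list one row per segment found in the bytes read, with 225 (0xE1) in the first column for the segment that holds the Exif data.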

From inspection of the file, I could see that the Comments data were in a section with identifier 0xE1, defined as “application specific” – but I was unable to find a description, so certain values in my function were obtained by inspection. Here is the main function:

[0] Comments←JPEG_Comments File;z;C0;Len;Loc;Exif0
[1] z←9600 NBREAD File
[2] Exif0←30 ⍝ Offset to table
[3] C0←1↑(HEXtoCHAR 2 2⍴'9C9C'){⎕IO-⍨(⍺⍷⍵)/⍳⍴⍵}200↑z ⍝ Offset to table row
[4] Len←CHARtoUSI 4↑(C0+4)↓z
[5] Loc←CHARtoUSI 4↑(C0+8)↓z
[6] Comments←(HEXtoCHAR'00')~⍨Len↑(Exif0+Loc)↓z
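The utility functions NBREAD, HEXtoCHAR and CHARtoUSI called above are not listed in this article. The definitions below are guesses at what they might look like, inferred only from the way they are called; they are not the author's originals, and they assume the Unicode edition of Dyalog.

```apl
⍝ Hypothetical: read the first ⍺ bytes of file ⍵ as untranslated characters
NBREAD←{
    tie←⍵ ⎕NTIE 0               ⍝ 0 asks for the next free tie number
    r←⎕NREAD tie 80 ⍺ 0         ⍝ type 80, ⍺ bytes, from the start of the file
    r⊣⎕NUNTIE tie
}

⍝ Hypothetical: hex digits (a vector, or a matrix with one byte per row) to characters
HEXtoCHAR←{⎕UCS 16⊥⍉⎕IO-⍨(⎕D,'ABCDEF')⍳(((×/⍴⍵)÷2),2)⍴⍵}

⍝ Hypothetical: big-endian bytes held as characters to one unsigned integer
CHARtoUSI←{256⊥⎕UCS ⍵}
```

With these definitions, HEXtoCHAR'00' is a one-element vector holding the null character, and CHARtoUSI ⎕UCS 0 0 1 2 gives 258.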

At line 1, the first 9600 bytes of the file are read as untranslated character (data type 80); this is more than enough to include the Comments. (You may prefer to read the data as type 83, 8-bit signed integer, and convert to characters using ⎕DR as necessary.)
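As an illustration of the type-83 alternative, the lines below read the same bytes as signed integers and map them to characters with ⎕UCS rather than ⎕DR. This is a sketch only: File is assumed to hold the path of the JPEG file, and the result should match a type-80 read in the Unicode edition.

```apl
tie←File ⎕NTIE 0
b←256|⎕NREAD tie 83 9600 0      ⍝ signed bytes ¯128..127 mapped to 0..255
⎕NUNTIE tie
z←⎕UCS b                        ⍝ characters with code points 0..255
```

Keeping the integer vector b around can be convenient, because lengths and offsets can then be decoded with 256⊥ directly, without converting from characters first.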

The Exif data is indexed in four-column tables called Image File Directories (IFDs), and subsequent offsets are relative to the start of the first IFD, following the 30 bytes of the TIFF header (line 2). By inspection, the table row sought starts with 0x9C9C; this is followed by a two-byte data-type code, which is decimal 1, defined as “unsigned byte”. Then come a four-byte data length and a four-byte offset, both converted from character to unsigned integer by CHARtoUSI. Finally, the Comments can be extracted and the nulls between each pair of characters removed.
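The 12-byte row layout described above (two-byte tag, two-byte type, four-byte length, four-byte offset) can be decoded in one step. The function name IFDRow is hypothetical, and the byte values in the example are made up for illustration rather than taken from a real file.

```apl
IFDRow←{
    ⍝ ⍵: the 12 bytes of one IFD row as integers 0-255, big-endian
    tag←256⊥2↑⍵                 ⍝ row identifier, e.g. 0x9C9C
    type←256⊥2↑2↓⍵              ⍝ data-type code, 1 = unsigned byte
    len←256⊥4↑4↓⍵               ⍝ data length
    loc←256⊥4↑8↓⍵               ⍝ offset to the data
    tag type len loc
}

IFDRow 156 156 0 1 0 0 0 20 0 0 8 66    ⍝ → 40092 1 20 2114
```

Here 40092 is 0x9C9C in decimal; the last two numbers correspond to Len and Loc in the main function.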

(Thanks to Vince Chan and Richard Smith of Dyalog for comments on an earlier draft.)

My chief sources of reference were:
http://dev.exiv2.org/projects/exiv2/wiki/The_Metadata_in_JPEG_files
http://www.itu.int/itudoc/itu-t/com16/tiff-fx/docs/tiff6.pdf
http://www.media.mit.edu/pia/Research/deepview/exif.html