[Gllug] Describing pictures in XML

salsaman at xs4all.nl
Fri Jun 18 10:55:15 UTC 2010

On Fri, June 18, 2010 11:53, Richard Lewis wrote:
> At Thu, 17 Jun 2010 21:58:14 +0100,
> James Courtier-Dutton wrote:
>> Hi,
>> Does anyone know if there is a standard XML format to describe pictures?
>> I am thinking of something like facebook, where one can highlight
>> areas and put names to them.
>> I want to have a picture.jpg file, and keep a picture.jpg.xml file next
>> to it.
>> In the .xml file I wish to store people's names, highlighted areas, etc.
>> that describe the picture.
>> In this way, I can archive the pictures, and know that they will be
>> readable, including the metadata, in many years' time.
>> Most picture applications I know of seem to store this information in
>> databases that make it difficult to extract the metadata for just one
>> of the photos.
>> I could then add search tools, that could show me all pictures of person
>> A.
>> With face recognition software, I could also have it scan all my
>> pictures and find any other pictures where person A appears and update
>> the .XML files appropriately.
> The Text Encoding Initiative <http://www.tei-c.org/> specify in their
> guidelines a way of encoding information about images. Principally,
> the <surface> element is used:
> http://www.tei-c.org/release/doc/tei-p5-doc/en/html/ref-surface.html
> TEI is quite a large specification and learning it is quite, erm,
> tedious. (Can be very worthwhile if you're interested in applications
> of computers in the humanities.) Fortunately, there's a tool for
> generating TEI image markup in a pointy-clicky way:
> http://tapor.uvic.ca/~mholmes/image_markup/
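
To make the sidecar idea concrete, here is a rough sketch of generating a
picture.jpg.xml with named rectangular regions. The element and attribute
names (facsimile, surface, graphic, zone, ulx/uly/lrx/lry) follow my reading
of the TEI <surface> documentation linked above; putting each name in a
<note> inside its <zone> is my own simplification, the output is not a
complete or validated TEI document, and the function names, file names and
coordinates are made up for illustration.

```python
import xml.etree.ElementTree as ET


def build_sidecar(image_name, width, height, regions):
    """Build an XML tree describing named rectangles on one image.

    regions is a list of (label, ulx, uly, lrx, lry) tuples in pixels:
    upper-left x/y and lower-right x/y of each highlighted area.
    """
    facsimile = ET.Element("facsimile")
    # The surface covers the whole picture.
    surface = ET.SubElement(facsimile, "surface",
                            ulx="0", uly="0", lrx=str(width), lry=str(height))
    ET.SubElement(surface, "graphic", url=image_name)
    for label, ulx, uly, lrx, lry in regions:
        zone = ET.SubElement(surface, "zone", ulx=str(ulx), uly=str(uly),
                             lrx=str(lrx), lry=str(lry))
        # Assumption: the person's name is stored as a note in the zone.
        ET.SubElement(zone, "note").text = label
    return ET.ElementTree(facsimile)


def people_in(tree):
    """Return the labels of all zones, in document order."""
    return [note.text for note in tree.getroot().iter("note")]


# Example use (hypothetical names and coordinates):
#   tree = build_sidecar("picture.jpg", 640, 480,
#                        [("Person A", 10, 20, 110, 140)])
#   tree.write("picture.jpg.xml", encoding="utf-8", xml_declaration=True)
```

Searching for "all pictures of person A" is then a matter of parsing each
.xml file with ET.parse() and checking people_in() for the name.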

I don't know whether automating the process would work well, but there is an
algorithm for classifying similar-looking images; it is called the Haar


When I was researching this for a presentation a couple of years ago I
found the following project which implements it:


I pulled out the code which does the actual Haar calculation and made it
into a standalone plugin. If you are interested, I can dig out the code
which categorises (enumerates) a directory of images and writes the
results to a flat file.
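
For anyone who wants a feel for the approach in the meantime, below is a
minimal sketch of the general idea. It is not the plugin code mentioned
above. It assumes a plain averaging/differencing Haar decomposition, keeps
the positions and signs of the largest coefficients as a signature, and
writes one tab-separated line per image. The function names, the 32x32
working size and the file format are all invented for illustration, and the
image loading assumes the Pillow library is installed.

```python
import os


def haar_1d(values):
    """Full 1-D Haar decomposition; len(values) must be a power of two."""
    out = list(values)
    n = len(out)
    while n > 1:
        half = n // 2
        avgs = [(out[2 * i] + out[2 * i + 1]) / 2.0 for i in range(half)]
        diffs = [(out[2 * i] - out[2 * i + 1]) / 2.0 for i in range(half)]
        out[:n] = avgs + diffs  # averages first, detail coefficients after
        n = half
    return out


def haar_2d(grid):
    """Transform every row, then every column, of a square grid."""
    rows = [haar_1d(row) for row in grid]
    cols = [haar_1d(col) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


def signature(grid, keep=40):
    """Return (overall average, signed positions of the largest coefficients)."""
    coeffs = [c for row in haar_2d(grid) for c in row]
    ranked = sorted((i for i in range(1, len(coeffs)) if coeffs[i] != 0),
                    key=lambda i: (-abs(coeffs[i]), i))
    signed = sorted(i if coeffs[i] > 0 else -i for i in ranked[:keep])
    return coeffs[0], signed


def overlap(sig_a, sig_b):
    """Crude similarity score: number of shared signed positions."""
    return len(set(sig_a[1]) & set(sig_b[1]))


def load_gray_grid(path, size=32):
    """Load an image as a size x size grid of grey levels (0-255)."""
    from PIL import Image  # assumption: Pillow is available
    with Image.open(path) as img:
        data = img.convert("L").resize((size, size)).tobytes()
    return [list(data[r * size:(r + 1) * size]) for r in range(size)]


def write_flat_file(signatures, out_path):
    """Write 'name<TAB>average<TAB>positions' lines, sorted by name."""
    with open(out_path, "w", encoding="utf-8") as out:
        for name in sorted(signatures):
            average, signed = signatures[name]
            out.write("%s\t%.4f\t%s\n"
                      % (name, average, " ".join(map(str, signed))))


def index_directory(directory, out_path, size=32):
    """Compute a signature for each image in a directory and save them."""
    signatures = {}
    for name in sorted(os.listdir(directory)):
        if name.lower().endswith((".jpg", ".jpeg", ".png")):
            grid = load_gray_grid(os.path.join(directory, name), size)
            signatures[name] = signature(grid)
    write_flat_file(signatures, out_path)
```

Two images whose signatures share many signed positions (a high overlap()
score) are likely to look alike; a real matcher would weight the positions
rather than just count them.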


Gllug mailing list  -  Gllug at gllug.org.uk
