The Issue

The Text Encoding Initiative Guidelines have traditionally focused on rules for the encoding of textual materials. Although there are methods for including images as an aspect of editions (as described in Chapter 22, section 3 of TEI P5, "Specific Elements for Graphic Images"), the Guidelines do not include recommendations for close linking between specific areas of text and areas of images. This document presents a number of recommendations for tackling the broad issue of incorporating image information into TEI documents.

NOTE: Conal very recently came up with the idea of using MPEG-21 fragment identifiers [mp()] to combine a pointer to the image file and the coordinates into a single URI. This makes it much easier to do things like point from one TEI text out to multiple areas on multiple image files (if you have many manuscript witnesses of the same text, or many different image types of a single manuscript).

UPDATE: The mp() scheme will not work for image files.
The trouble is this: what we want to do is use a URI syntax because it's simpler than adding markup, and here's a URL "fragment identifier" syntax which allows you to scale and clip, but this particular fragment identifier syntax does not apply to JPEG or TIFF images, for instance. URL fragment identifier syntaxes are totally specific to particular media types (I mean "Internet Media Types" or MIME Content Types or whatever you call them). For HTML there's a simple syntax which allows you to refer to elements by @id, or to <a> elements by @name. For XML there's the same syntax for @xml:id as well. For XML you can also use "XPointer" fragment identifiers, and for SVG there's svgView. There's also one for WebCGM. Now, according to this new standard (ISO/IEC 21000-17) the "mp" syntax applies only to certain media types:
Note the absence of e.g. image/tiff or image/jpeg or indeed image/anything. In fact neither of these 2 media types has any fragment identifier syntax defined. It would be possible to use such a syntax but it would not be kosher at all, and I think it'd be wrong for the TEI-C to recommend it.

We could design something similar, however. What about defining a new data type for "facsimile pointers", consisting of an image URL and a bounding box, stuck together in some way such as:
<w fax="p1edA.jpg (10,10,20,20) p1edB.jpg (20,20,24,22)">foo</w>

So the @fax value is a list of ordered pairs, i.e. a list of URL/bounding-box pairs.
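
To make the proposed data type concrete, here is a minimal sketch of how a processor might split a @fax value into its URL/bounding-box pairs. The function and class names are hypothetical, and reading the four numbers as left, top, width, and height follows the comment attached to the same example later in this document. It also assumes the URLs contain no whitespace and the numbers are non-negative integers.

```python
import re
from typing import NamedTuple


class FacsimilePointer(NamedTuple):
    """One URL/bounding-box pair from a @fax value (hypothetical type)."""
    url: str
    left: int
    top: int
    width: int
    height: int


# One pair: a URL without whitespace, optional spaces, then "(n,n,n,n)".
_PAIR = re.compile(r"(\S+)\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)")


def parse_fax(value):
    """Split a @fax value into its pairs, preserving document order."""
    pointers = [
        FacsimilePointer(m.group(1), *(int(n) for n in m.groups()[1:]))
        for m in _PAIR.finditer(value)
    ]
    # Anything left over after removing the matched pairs is malformed input.
    if _PAIR.sub("", value).strip():
        raise ValueError(f"malformed @fax value: {value!r}")
    return pointers


pairs = parse_fax("p1edA.jpg (10,10,20,20)              p1edB.jpg (20,20,24,22)")
for pointer in pairs:
    print(pointer.url, pointer.left, pointer.top, pointer.width, pointer.height)
```

Because the pairs are matched as units, the amount of whitespace between them does not matter, which is convenient given how attribute values get normalized.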

We've replaced our mp() scheme examples with our own "facsimile pointers" examples, and have also kept our initial examples.

TEI Structure

The structure of TEI elements generally corresponds to the logical structure of the document, rather than its layout structure. Often a <div> (representing a chapter or other textual section) will end halfway down a page, and a new <div> will start. A <div> may correspond to several pages. A <p> may start on one page and finish on another. That's why <pb/> is a milestone rather than a container element: textual divisions will more often than not overlap and conflict with the physical pages that contain them.

In general we're dealing with paginated media and we should have elements which correspond to pages. These elements can be linked to ranges of content. Typically I'd imagine they would correspond to a run of elements between 2 pb elements.

The page elements should not link directly to image files (as the current P5 graphic element does, or as the svg:image element does - http://www.w3.org/TR/SVG/struct.html#ImageElement ). There may be > 1 image file for a given page, taken under different lighting conditions, different resolutions, colour models, etc.

Real-Life Examples: Four projects with distinctly different requirements:

1) An edition of a single text from a single manuscript. This manuscript is not illustrated or illuminated in any way, but I am noting all abbreviations and distinctive paleographical aspects in the manuscript, and descriptors about the condition of the manuscript, so I would like to be able to have coordinates available for those.

Basic needs:


2) An edition of a single text from several different manuscripts. Again, none of these manuscripts are illustrated or illuminated, though again I will be noting abbreviations, distinctive paleographical aspects, and condition of all manuscripts, so I would like to have coordinates available for those.

Basic needs:


3) An edition developed from one manuscript. This manuscript is not illustrated or illuminated; however, it does consist of several interrelated texts. On every page there is a main text, three sets of marginalia, and interlinear notes; some pages also have headings of various sorts. So we need to edit every set of text and link those texts together, in addition to linking all the texts to the coordinates of the manuscript folia.

Basic needs:

4) An edition of a single text in one manuscript. This text also consists of several different texts, however this manuscript is also heavily illustrated with sometimes two or more illustrations on a single page.

Basic needs:



Considering Encoding Guidelines based on the various needs of these Projects:


1) A method for linking pages of text to page images.

For a transcript from a single manuscript, this could be done through an attribute on the <pb> element:

<pb n="154v" url="image.jpg"/>

Even in cases where there are multiple manuscripts edited together, through the @ed attribute on <pb> it would still be possible to link directly to image files:

...<pb n="154v" ed="#A" url="A/image.jpg"/>...<pb n="20" ed="#B" url="B/image.jpg"/>...

However, there will be instances where there are multiple images for a single page (for example, images taken under regular light, scanned from microfilm, and taken under ultraviolet light; the same image in different resolutions or file types). One relatively simple approach would be to create a new attribute (@fax) with type data.pointers (rather than data.pointer), which would contain pointers to each individual file:

<pb fax="A/image1.jpg A/image2.jpg B/image1.jpg"/>

Another option is to map the image files to the manuscript page numbers, most likely in a section in the TEI Header. The EPPT provides an Image Catalog that serves this purpose (see http://www.tei-c.org.uk/wiki/index.php/LegacyFacsimileMarkup#Image_Catalog), and METS also provides a method for organizing image (and text) files to show the structure of the physical and digital object (see http://www.tei-c.org.uk/wiki/index.php/LegacyFacsimileMarkup#Using_METS_to_link_text_and_image). METS is quite complicated, and though it might influence the TEI image linking recommendations, it should not be adopted in full. The EPPT Image Catalog is simple and also provides guidance for TEI image linking recommendations. Conal's v.3 Straw Man Facsimile Markup (http://www.tei-c.org.uk/wiki/index.php/StrawManFacsimileMarkup#v_v3) draws on METS, especially, and allows an editor to

a) Identify different categories for different image types, which may be based on various aspects of the image files: whether the images were taken under regular light, ultraviolet light, or scanned from microfilm or transparencies; the size of the files (thumbnails, reference size, archive size); whether the file or image is annotated; and file types (TIFF, JPEG, GIF, etc.). Assign each category a unique xml:id.
b) For each page, list all image files (with various metadata), linking each file to a category (using @facsimiletype as a ref to the category xml:id). Each page in turn is assigned a unique xml:id.
c) Within the body of the TEI file, use @decls on each <pb> to reference the xml:id of the containing page.


2) A method for linking areas of text to corresponding areas on the images.

Once the pages of text are linked to the corresponding image files, we need a way to use image coordinates to link sections of text within the pages to areas on the images.  As in 1), the storage place for coordinates will vary depending on the number of manuscripts and image files in the project.

We can assume that in most instances one image file will correspond to one page, and in this case it would not be necessary to relate image coordinates to <pb>. However (as I know from one of the projects I am working on now) it is possible that image files will be of facing pages. In this case we would want to differentiate between the text from the left side of the image file and the right side of the image file. This could be done in two ways:

a) Store these coordinates in a @coords attribute on the <pb> in the TEI body. This would only work if the manuscript(s) being edited only had one set of image files.
b) Store these coordinates in the image file list section of the TEI header (add a @coords attribute on <graphic>, see again http://www.tei-c.org.uk/wiki/index.php/StrawManFacsimileMarkup#v_v3). This would work no matter how many image files are in a project, since all images are listed separately by page. It should also be mentioned that in the case where two pages are on a single image file, the image file will be listed twice in the Page List, once for each containing page.

To link from sub-page areas of text to image coordinates is again more or less complicated depending on the number of manuscripts and different image files making up an edition.

a) A single text from a single manuscript with one set of representative image files. Image file references can be stored directly in <pb> through @url. Coordinates for that image file could be stored directly in the relevant text elements through a @coords attribute on the following elements:
<c> (analysis)
<abbr> (core)
<damage> (transcr)
<unclear> (core)
<add> (core)
<del> (core)
<restore> (transcr)
<sic> (core)
<handShift> (transcr)
<space> (transcr)
<fw> (transcr)
<w> (analysis)

These elements clearly represent items of interest on the physical page.

Building our own fragment identifiers we can combine a pointer to the image file and the coordinates into a single URI in place of @coords above:
<!-- the numbers in the URL fragment represent, respectively, the left, top, width, and height -->
<w fax="p1edA.jpg (10,10,20,20)  p1edB.jpg (20,20,24,22)">foo</w>


b) A single text from a single manuscript with multiple representative image files. Because reference points within the image files will be different from one another, it is not enough to simply store one set of coordinates. There are two possible options:

1. Manually create bounding-boxes for each image file and store these as part of the Page List information in the header. Assign each text element an @xml:id and link the elements to the coordinates in the Page List through @decls [modified from v v3]:

	<teiHeader>
<!-- list of pages -->

<pg xml:id="pg1" width="100mm" height="120mm">

<graphic graphicType="#access" xml:id="access-p1" url="access-p1.jpg">
<coords decls="#abbr1" x="40" y="460"/>
<coords decls="#add1" x="75" y="348"/>
</graphic>
<graphic graphicType="#ultraviolet" xml:id="uv-p1" url="uv-p1.jpg">
<coords decls="#abbr1" x="23" y="345"/>
<coords decls="#add1" x="93" y="769"/>
</graphic>

</pg>
...
</teiHeader>
<body>
...
<pb xml:id="pb1" decls="#pg1"/>
...<abbr xml:id="abbr1" type="macron">...</abbr>...<add xml:id="add1" type="marginal">...</add>
</body>

This <pg>, however, seems overly verbose. It would be better if the regions defined by the <coords> were defined only once, in a device-independent coordinate space belonging to the <pg>, and the individual <graphic>s mapped to their enclosing <pg>s by a coordinate transformation, clipping, etc.
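
For what it's worth, the linkage in option 1 is easy enough to follow programmatically. The sketch below, using only the Python standard library, looks up the coordinates recorded for a given text element in every image of the page. The <TEI> wrapper, the placeholder element content, and the function name are assumptions added so that the fragment parses on its own; the <pg>, <graphic> and <coords> markup is copied from the example above.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes xml:id under the reserved XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

SAMPLE = """
<TEI>
  <teiHeader>
    <pg xml:id="pg1" width="100mm" height="120mm">
      <graphic graphicType="#access" xml:id="access-p1" url="access-p1.jpg">
        <coords decls="#abbr1" x="40" y="460"/>
        <coords decls="#add1" x="75" y="348"/>
      </graphic>
      <graphic graphicType="#ultraviolet" xml:id="uv-p1" url="uv-p1.jpg">
        <coords decls="#abbr1" x="23" y="345"/>
        <coords decls="#add1" x="93" y="769"/>
      </graphic>
    </pg>
  </teiHeader>
  <body>
    <pb xml:id="pb1" decls="#pg1"/>
    <abbr xml:id="abbr1" type="macron">placeholder</abbr>
    <add xml:id="add1" type="marginal">placeholder</add>
  </body>
</TEI>
"""


def coords_for(root, element_id):
    """Return {image URL: (x, y)} for every graphic that locates element_id."""
    found = {}
    for graphic in root.iter("graphic"):
        for coords in graphic.findall("coords"):
            if coords.get("decls") == "#" + element_id:
                found[graphic.get("url")] = (
                    int(coords.get("x")),
                    int(coords.get("y")),
                )
    return found


root = ET.fromstring(SAMPLE)
for element in root.find("body"):
    ident = element.get(XML_ID)
    print(element.tag, ident, coords_for(root, ident))
```

Note that every lookup has to walk each <graphic> separately, which is another symptom of the verbosity just mentioned.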


2. Automatically map coordinates from one image file to the others. This can be done since the mappings between the pg and the graphics which it contains are well defined. For example, consider a word which appears in a 10x10 square on image-a:

<w fax="image-a.jpg (10,10,10,10)">foo</w>

We can then automatically derive the corresponding portion of image-b, too, so long as there is a <pg> element which includes a graphic whose URL starts with "image-a.jpg" as well as a graphic whose URL starts with "image-b.jpg". To calculate it, parse the fragment id of each graphic into left, top, width, and height, multiply by the graphic/@scale factor, and it should all come out in the wash.
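
A rough sketch of that calculation follows. It rests on assumptions that the text above does not spell out: that the box in each graphic's fragment gives the area of the image occupied by the page (in image pixels), and that graphic/@scale converts image pixels into the page's own units. The class and function names and the numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Graphic:
    """One image of a page (hypothetical model).

    left/top/width/height: where the page sits on the image, in pixels
    (width and height are kept for completeness; the mapping below only
    needs the origin and the scale).
    scale: page units per image pixel, standing in for graphic/@scale.
    """
    url: str
    left: float
    top: float
    width: float
    height: float
    scale: float


def to_page(box, graphic):
    """Convert an image-pixel box (left, top, width, height) to page units."""
    left, top, width, height = box
    return (
        (left - graphic.left) * graphic.scale,
        (top - graphic.top) * graphic.scale,
        width * graphic.scale,
        height * graphic.scale,
    )


def from_page(box, graphic):
    """Convert a page-unit box back to pixels on the given image."""
    left, top, width, height = box
    return (
        left / graphic.scale + graphic.left,
        top / graphic.scale + graphic.top,
        width / graphic.scale,
        height / graphic.scale,
    )


def map_box(box, source, target):
    """Map a box on one image of a page to the same area on another image."""
    return from_page(to_page(box, source), target)


# Two images of the same 100 x 120 unit page, at different resolutions.
image_a = Graphic("image-a.jpg", left=0, top=0, width=200, height=240, scale=0.5)
image_b = Graphic("image-b.jpg", left=50, top=20, width=400, height=480, scale=0.25)

print(map_box((10, 10, 10, 10), image_a, image_b))
```

Going through the page's coordinate space, rather than directly from image to image, means each graphic only needs to know its own relation to the <pg>.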

c) A single text from multiple manuscripts with one set of representative image files for each manuscript. Since <pb> can differentiate between manuscripts through @ed (so page breaks for multiple manuscripts can co-locate within the same TEI document), coordinates for pages, if needed, can be stored directly in a @coords attribute on the various <pb> elements or within a @coords attribute on individual <graphic> elements if the tei:figure/tei:graphic method of identifying page images is used.


When we come to storing coordinates in the elements representing stuff on the physical page (<c> through <fw> in 2.a above) it is rather more complicated. The approach that we prefer is to use the fragment identifiers described above, by which we can point to multiple images & coordinates within the same element.


<w fax="p1edA.jpg (10,10,20,20) p1edB.jpg (20,20,24,22)">foo</w>


Another way to accomplish this would be to use parallel segmentation to identify the manuscript from which the various elements come, and place a @coords or @fax attribute on the physical elements:


	<app>
<rdg wit="#A"><damage>fundulus</damage></rdg>
<rdg wit="#B">fundulus</rdg>
</app>

Another option would be to declare a @wit or @ed attribute on the physical elements and list all the variants together in <choice> - parallel to <app><rdg> rather than within <app><rdg>:


<choice>
<abbr wit="#A" coords="2,3,4,5">pncipaliam</abbr>
<abbr wit="#B" coords="5,6,7,8">principalia</abbr>
<expan>principaliam</expan>
</choice>
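
As a quick illustration of how a processor might use @wit on the children of <choice>, the sketch below picks out the reading and coordinates belonging to one witness. The function name is hypothetical, and treating @coords as a comma-separated list of integers is an assumption based on the example values.

```python
import xml.etree.ElementTree as ET

CHOICE = """
<choice>
  <abbr wit="#A" coords="2,3,4,5">pncipaliam</abbr>
  <abbr wit="#B" coords="5,6,7,8">principalia</abbr>
  <expan>principaliam</expan>
</choice>
"""


def reading_for(choice, witness):
    """Return (text, coords) for the child carrying the given @wit, else None."""
    for child in choice:
        # Allow @wit to hold several whitespace-separated pointers.
        if witness in (child.get("wit") or "").split():
            coords = tuple(int(n) for n in child.get("coords", "").split(",") if n)
            return child.text, coords
    return None


choice = ET.fromstring(CHOICE)
print(reading_for(choice, "#A"))
print(reading_for(choice, "#B"))
```

The <expan> carries no @wit and so is never returned here; whether it should be treated as common to all witnesses is a design question this sketch leaves open.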

In instances where the differences are not parallel between the texts (manuscript damage, for example), <choice> itself is unnecessary. In this example, the text ("fundulus") is the same in both manuscripts, but is only damaged in manuscript A.


...<damage wit="#A">fundulus</damage>

This approach opens the can of worms called "overlap": what happens when part of fundulus is damaged in manuscript A and an overlapping part is damaged (or deleted, or an addition) in manuscript B? This is a more general problem in the TEI that I don't think we should go into here, but I did want to mention it lest anyone think that it was forgotten or not considered.


The @wit values link the physical elements to the preceding <pb> elements, which have the image files attached to them through @url.


d) A single text from multiple manuscripts with several sets of representative image files for each manuscript. Combine the Page List described in 2.b.1 with the markup for multiple manuscripts from 2.c (does this even make sense?).


3) A method for describing and linking illustrations, illuminations, figures to corresponding areas on the image.


As above, the simplest approach for linking illustrations, illuminations, and figures described in a TEI file to the image files is through the "facsimile pointers" scheme:


 <figure>
<graphic url="p1edA.jpg (10,10,20,20)"/>
</figure>



More advanced image/text linking could also be enabled through the incorporation of Scalable Vector Graphics (SVG), the XML standard for describing 2D graphics (such as bounding boxes on an image file) in XML.


A Quick Look Through the TEI Figures Module

Module: figures (Tables, Formulae, and Graphics, P5 chapter 22)

Elements Defined: table row cell formula figure figDesc graphic binaryObject

The "figure" module really contains three related - but very different - aspects.

  1. Tables
    Elements: table row cell
    I don't immediately see anything corresponding to this in SVG. It wouldn't make sense, anyway - tables are really for organizing textual material. Anything else?
  2. Formulae
    Elements: formula
    Formula already allows for the inclusion of elements from outside the TEI, although the examples in the Guidelines do not use namespaces, and namespaces are not mentioned in the text. Should we recommend that <formula> require namespaces for elements pulled in from elsewhere? If so, what about (as in the first two examples) when the notation is non-XML? Is @notation really enough?
  3. Graphic Images
    Elements: figure graphic binaryObject figDesc
    The <figure> element is used to contain images, captions, and textual descriptions of the pictures.
    The images themselves are specified using the <graphic> element, whose url attribute provides the location of an image.
 <figure>
<graphic url="Fig1.pdf"/>
</figure>

<figure> may also contain <head> (providing a title or heading for the image), <figDesc> (a description of the image) and <p> (commentary or caption, not a description of the image).

"Where the graphic itself contains large amounts of text, perhaps with a complex structure, and perhaps difficult to distinguish from the graphic, the encoder should choose whether to regard the graphic as containing the text (in which case, a nested <text> element may be included within the <figure> element) or to regard the enclosed text as being a separate division of the <text> element in which the graphic appears. In this latter case, an appropriate divn class element may be used for the text represented within the graphic, and the <figure> element embedded within it. The choice will depend to a large degree on the encoder's understanding of the relationship between the graphic and the surrounding text."

So <figure> may also contain <text>.

TEI Elements and their SVG Equivalents (approx.)

Quick Links:

TEI Chapter 22 Tables, Formulae, and Graphics

Scalable Vector Graphics (SVG) 1.1 Specification

tei:graphic and svg:image


tei:graphic

tei:graphic "indicates the location of an inline graphic, illustration, or figure."

attributes: (In addition to global attributes)

width 	The display width of the image
Status: Mandatory when applicable
Datatype: data.outputMeasurement
height The display height of the image
Status: Mandatory when applicable
Datatype: data.outputMeasurement
scale A scale factor to be applied to the image to make it the desired display size
Status: Mandatory when applicable
Datatype: data.probability
url The target URL
Status: Mandatory when applicable
Datatype: data.pointer
Values: The name of a URL which provides the image.
mimeType The MIME type
Status: Mandatory when applicable
Datatype: data.word
Values: The MIME type to be used for the object when it is decoded
 <figure>
<graphic url="fig1.png"/>
<head>Figure One: The View from the Bridge</head>
<figDesc>A Whistleresque view showing four
or five sailing boats in the foreground, and a
series of buoys strung out between them.</figDesc>
</figure>


svg:image

svg:image "indicates that the contents of a complete file are to be rendered into a given rectangle within the current user coordinate system. The 'image' element can refer to raster image files such as PNG or JPEG or to files with MIME type of "image/svg+xml""

attributes:

x = "<coordinate>"
The x-axis coordinate of one corner of the rectangular region into which the referenced
document is placed.
If the attribute is not specified, the effect is as if a value of "0" were specified.
Animatable: yes.
y = "<coordinate>"
The y-axis coordinate of one corner of the rectangular region into which the referenced
document is placed.
If the attribute is not specified, the effect is as if a value of "0" were specified.
Animatable: yes.
width = "<length>"
The width of the rectangular region into which the referenced document is placed.
A negative value is an error (see Error processing). A value of zero disables rendering of
the element.
Animatable: yes.
height = "<length>"
The height of the rectangular region into which the referenced document is placed.
A negative value is an error (see Error processing). A value of zero disables rendering of
the element.
Animatable: yes.
xlink:href = "<uri>"
A URI reference.
Animatable: yes.
 <figure>
<svg:image xlink:href="fig1.png"/>
<head>Figure One: The View from the Bridge</head>
<figDesc>A Whistleresque view showing four
or five sailing boats in the foreground, and a
series of buoys strung out between them.</figDesc>
</figure>



Thoughts

If we were to recommend a module to import all of SVG, it would be preferable to use only svg:image and to drop tei:graphic entirely (if svg:image does indeed do everything we would need it to do in TEI). But if we don't want to "modulate" SVG (if we just say that we will refer to external SVG files if we need them), do we still want to maintain a separate tei:graphic element?


The nested grouping of TEI image elements


tei:figure/tei:graphic|tei:head|tei:figDesc

May map to the SVG:

svg:g/svg:image|svg:title|svg:desc

Examples

 <figure xml:id="id1">
<graphic width="4" height="2" url="fig1.png" mimeType="image/png"/>
<head type="image title">Figure One: The View from the Bridge</head>
<figDesc>A Whistleresque view showing four
or five sailing boats in the foreground, and a
series of buoys strung out between them.</figDesc>
</figure>
 <svg:g id="id1">
<svg:image width="4" height="2" xlink:href="fig1.png"/>
<svg:title>Figure One: The View from the Bridge</svg:title>
<svg:desc>A Whistleresque view showing four
or five sailing boats in the foreground, and a
series of buoys strung out between them.</svg:desc>
</svg:g>
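
The mapping is mechanical enough that it can be sketched in a few lines of Python. This is only an illustration of the correspondence above, not a proposed tool: the function name is hypothetical, the TEI input is assumed to be un-namespaced as in the example, and attributes with no obvious SVG counterpart (@mimeType, @type) are simply dropped.

```python
import xml.etree.ElementTree as ET

SVG = "http://www.w3.org/2000/svg"
XLINK = "http://www.w3.org/1999/xlink"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Ask the serializer to use the conventional prefixes.
ET.register_namespace("svg", SVG)
ET.register_namespace("xlink", XLINK)

FIGURE = """<figure xml:id="id1">
<graphic width="4" height="2" url="fig1.png" mimeType="image/png"/>
<head type="image title">Figure One: The View from the Bridge</head>
<figDesc>A Whistleresque view showing four
or five sailing boats in the foreground, and a
series of buoys strung out between them.</figDesc>
</figure>"""


def figure_to_svg_group(figure):
    """Map figure/graphic|head|figDesc onto svg:g/svg:image|svg:title|svg:desc."""
    group = ET.Element(f"{{{SVG}}}g")
    if figure.get(XML_ID):
        group.set("id", figure.get(XML_ID))
    for child in figure:
        if child.tag == "graphic":
            image = ET.SubElement(group, f"{{{SVG}}}image")
            for name in ("width", "height"):
                if child.get(name):
                    image.set(name, child.get(name))
            image.set(f"{{{XLINK}}}href", child.get("url", ""))
        elif child.tag in ("head", "figDesc"):
            target = "title" if child.tag == "head" else "desc"
            ET.SubElement(group, f"{{{SVG}}}{target}").text = child.text
    return group


group = figure_to_svg_group(ET.fromstring(FIGURE))
print(ET.tostring(group, encoding="unicode"))
```

The information that gets lost on the way (the MIME type, the @type on <head>) is a fair indication of where the two vocabularies do not line up exactly.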


Notes

 <svg:g id="id1">
<svg:image width="4" height="2" xlink:href="fig1.png"/>
<svg:title>Figure One: The View from the Bridge</svg:title>
<svg:desc>A Whistleresque view showing four
or five sailing boats in the foreground, and a
series of buoys strung out between them.</svg:desc>
<tei:p>paragraph here</tei:p>
<tei:text>text contained on the image</tei:text>
</svg:g>

Or should we encourage use of a standard TEI block element such as a <div> or <figure> to bracket together an svg element and any tei elements that need to be tied to it:

<figure>
<svg:g id="id1">
<svg:image width="4" height="2" xlink:href="fig1.png"/>
<svg:title>Figure One: The View from the Bridge</svg:title>
<svg:desc>A Whistleresque view showing four
or five sailing boats in the foreground, and a
series of buoys strung out between them.</svg:desc>
</svg:g>
<p>paragraph here</p>
<text>text contained on the image</text>
</figure>


Mapping attributes

Hold coordinates in individual elements using @mets:coords (instead of creating @tei:coords)

As described in the METS documentation: COORDS: an optional string attribute listing a set of visual coordinates within an image (still image or video frame). The COORDS attribute should be used as in HTML 4.

And in HTML 4.01: This attribute specifies the position and shape on the screen. The number and order of values depends on the shape being defined. Possible combinations:

Coordinates are relative to the top, left corner of the object. All values are lengths. All values are separated by commas.


METS defines @mets:coords on <area>. For TEI, it would be nice to have this attribute available to "regular" elements, not to a special element. The reasoning is that in many cases, especially when dealing with primary source texts, TEI elements refer not to a text in general but to the text as it appears in a specific physical document. It may not make sense to allow @mets:coords on <p>, but it may make perfect sense to allow it on those elements described in Chapter 18 Transcription of Primary Sources that relate to a specific physical occurrence:

<abbr>
<sic>
<add>
<del>
<hi>
<restore>
<gap>
<damage>
<unclear>
<space>
<fw>

SVG defines various attribute values for coordinates. The system is based on a shape (identified by the element), but the attributes vary, so one could use the same attributes in the same element in various combinations to achieve the same result:

rectangle:

@svg:x = length
@svg:y = length
@svg:width = length
@svg:height = length
(rounded edges:
@svg:rx = length
@svg:ry = length)

circle:

@svg:cx = "<coordinate>"
The x-axis coordinate of the center of the circle.
@svg:cy = "<coordinate>"
The y-axis coordinate of the center of the circle.
@svg:r = "<length>"
The radius of the circle.

ellipse:

@svg:cx = "<coordinate>"
The x-axis coordinate of the center of the ellipse.
@svg:cy = "<coordinate>"
The y-axis coordinate of the center of the ellipse.
@svg:rx = "<length>"
The x-axis radius of the ellipse.
@svg:ry = "<length>"
The y-axis radius of the ellipse.

polygon:

@svg:points = "<list-of-points>"
The points that make up the polygon. All coordinate values are in the user coordinate system.

@svg:points seems to me to be very similar to @mets:coords, except that it cannot be used to form a circle (only boundaries with straight edges).

There may be instances where one would want to use @mets:coords (for simple circles and bounding boxes) and other times when it would make more sense to use svg:x/y/width/height etc. (for more complex shapes).
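
To see how closely the two systems correspond, here is a small sketch that turns an HTML 4-style shape and coords value into the equivalent SVG element name and attributes. The function name is hypothetical. The value orders (rect: left-x, top-y, right-x, bottom-y; circle: center-x, center-y, radius; poly: x1,y1,...,xN,yN) are those of HTML 4.01; the sketch assumes plain integer pixel values and ignores percentage lengths.

```python
def coords_to_svg(shape, coords):
    """Return (SVG element name, attribute dict) for an HTML 4-style area."""
    numbers = [int(n) for n in coords.split(",")]
    if shape == "rect":
        left, top, right, bottom = numbers
        # HTML gives two corners; SVG wants one corner plus width and height.
        return "rect", {
            "x": left,
            "y": top,
            "width": right - left,
            "height": bottom - top,
        }
    if shape == "circle":
        cx, cy, r = numbers
        return "circle", {"cx": cx, "cy": cy, "r": r}
    if shape == "poly":
        if len(numbers) % 2:
            raise ValueError("poly needs an even number of values")
        pairs = zip(numbers[0::2], numbers[1::2])
        return "polygon", {"points": " ".join(f"{x},{y}" for x, y in pairs)}
    raise ValueError(f"unknown shape: {shape!r}")


print(coords_to_svg("rect", "10,10,30,30"))
print(coords_to_svg("circle", "50,50,25"))
print(coords_to_svg("poly", "0,0,10,0,10,10"))
```

The only real difference in the rectangle case is that the coords value carries two corners while SVG carries a corner and a size, so conversion in either direction is lossless.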

We need to continue looking at SVG, whether we want to be able to import it in a module or link to external files. Or both.


tei:binaryObject

A rough equivalent for tei:binaryObject in SVG is an svg:image whose xlink:href uses the "data" URL scheme.

<svg width="4in" height="3in" version="1.1"
xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<desc>This graphic links to a picture of Larry Masinter</desc>
<image x="200" y="200" width="48px" height="48px"
xlink:href="data:image/gif;base64,R0lGODdhMAAwAPAAAAAAAP///ywAAAAAMAAw
AAAC8IyPqcvt3wCcDkiLc7C0qwyGHhSWpjQu5yqmCYsapyuvUUlvONmOZtfzgFz
ByTB10QgxOR0TqBQejhRNzOfkVJ+5YiUqrXF5Y5lKh/DeuNcP5yLWGsEbtLiOSp
a/TPg7JpJHxyendzWTBfX0cxOnKPjgBzi4diinWGdkF8kjdfnycQZXZeYGejmJl
ZeGl9i2icVqaNVailT6F5iJ90m6mvuTS4OK05M0vDk0Q4XUtwvKOzrcd3iq9uis
F81M1OIcR7lEewwcLp7tuNNkM3uNna3F2JQFo97Vriy/Xl4/f1cf5VWzXyym7PH
hhx4dbgYKAAA7">
<title>Larry Masinter</title>
</image>
</svg>
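
For anyone wanting to experiment with this equivalence, a "data" URL of the kind used above can be built from raw image bytes with the Python standard library. The two function names are hypothetical, and the sketch only handles the base64 form of the scheme.

```python
import base64


def to_data_url(data, mime_type):
    """Encode raw bytes as a base64 "data" URL with the given media type."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def from_data_url(url):
    """Split a base64 "data" URL back into (media type, raw bytes)."""
    header, _, payload = url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("not a base64 data URL")
    mime_type = header[len("data:"):-len(";base64")]
    # b64decode discards embedded line breaks such as those in the example above.
    return mime_type, base64.b64decode(payload)


# The first six bytes of a GIF file are its signature.
url = to_data_url(b"GIF87a", "image/gif")
print(url)
print(from_data_url(url))
```

The same base64 payload could equally be placed inside a tei:binaryObject, with the media type moved to its mimeType attribute.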


Using SVG with TEI

In some instances, it makes sense to enable broad use of SVG throughout the TEI document. For instance, although most of the examples here relate to the embedding of bitmap data, the real power of SVG is in describing vector information, and it's the perfect tool for capturing dividing lines on the page, shapes, blocks of text set off from the page, decorative flourishes, simple diagrams, graphs, logos and so on. It's not the case that, for example, illuminated manuscript pages could be described in SVG in place of the use of hi-res page images, but it would be useful to specify the layout of a complex MS page (where there might be annotations on commentaries on commentaries) in terms of SVG shape structures, with the text element blocks embedded inside them, enabling a usefully approximate rendering of the page layout incorporating the transcription.