The Issue
The Text Encoding Initiative Guidelines have traditionally focused on rules for
the encoding of textual materials. Although there are methods for including
images as an aspect of editions (as described in Chapter 22, section 3 of TEI
P5, "Specific Elements for Graphic Images"), the Guidelines do not include
recommendations for close linking between specific areas of text and areas of
images. This document presents a number of recommendations for tackling the
broad issue of incorporating image information into TEI documents.
NOTE: Conal very recently came up with the idea of using MPEG-21 fragment
identifiers [mp()] to combine a pointer to the image file and the coordinates
into a single URI. This makes it much easier to do things like point from one TEI
text out to multiple areas on multiple image files (if you have many manuscript
witnesses of the same text, or many different image types of a single
manuscript).
UPDATE: The mp() scheme will not work for
image files. The trouble is this: what we want to do is use a URI syntax
because it's simpler than adding markup, and here's a URL "fragment identifier"
syntax which allows you to scale and clip, but this particular fragment
identifier syntax does not apply to JPEG or TIFF images, for instance. URL
fragment identifier syntaxes are totally specific to
particular media types (I mean "Internet Media Types" or MIME Content Types or
whatever you call them). For HTML there's a simple syntax which allows you to
refer to elements by @id, or to <a> elements by @name. For XML there's the
same syntax for @xml:id as well. For XML you can also use "XPointer" fragment
identifiers, and for SVG there's svgView. There's also one for WebCGM. Now,
according to this new standard (ISO/IEC 21000-17) the "mp" syntax applies only
to certain media types:
-
audio/mpeg
-
video/mpeg
-
video/mp4
-
audio/mp4
-
application/mp4
-
video/MPEG4-visual
-
application/mp21
Note the absence of e.g. image/tiff or image/jpeg or indeed
image/anything. In fact neither of these two media types has any fragment
identifier syntax defined. It would be possible to use such a syntax but it
would not be kosher at all, and I think it'd be wrong for the TEI-C to recommend
it.
We could design something similar, however. What about defining a new data type
for "facsimile pointers", consisting of an image URL and a bounding box, stuck
together in some way such as:
<w fax="p1edA.jpg (10,10,20,20)
p1edB.jpg (20,20,24,22)">foo</w>
So the @fax value is a list of ordered pairs, i.e. a list of URL/bounding-box
pairs.
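To make the proposed data type a little more concrete, here is a minimal Python sketch of how a processor might split a @fax value into its URL/bounding-box pairs. The names parse_fax and FacsimilePointer are invented for this illustration, and the sketch assumes that the four numbers mean left, top, width, and height and that URLs contain no whitespace; none of this is a settled syntax.

```python
import re
from typing import List, NamedTuple


class FacsimilePointer(NamedTuple):
    """One URL/bounding-box pair (hypothetical name, for illustration only)."""
    url: str
    left: int
    top: int
    width: int
    height: int


# One pair: a URL without whitespace, then "(left,top,width,height)".
_PAIR = re.compile(
    r"(\S+)\s*\(\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*\)"
)


def parse_fax(value: str) -> List[FacsimilePointer]:
    """Split a @fax value into its pairs, keeping document order."""
    return [
        FacsimilePointer(m.group(1), *(int(n) for n in m.groups()[1:]))
        for m in _PAIR.finditer(value)
    ]


for pointer in parse_fax("p1edA.jpg (10,10,20,20)\n p1edB.jpg (20,20,24,22)"):
    print(pointer.url, pointer.left, pointer.top, pointer.width, pointer.height)
```

Because the pairs are returned in order, a processor could treat the first pair as the preferred image and the rest as alternatives, though that ordering convention is only a suggestion here.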
We've replaced our mp() scheme examples with our own "facsimile pointers"
examples, and have also kept our initial examples.
TEI Structure
The structure of TEI elements corresponds generally to the
logical structure of the document, rather than its layout structure. Often a
<div> (representing a chapter or other textual section) will end halfway
down a page, and a new <div> will start. A <div> may correspond to
several pages. A <p> may start on one page and finish on another. That's
why <pb/> is a milestone rather than a container element - textual
divisions will more often than not overlap and conflict with the physical pages
that contain them.
In general we're dealing with paginated media and we should have elements which
correspond to pages. These elements can be linked to ranges of content.
Typically I'd imagine they would correspond to a run of elements between two <pb>
elements.
The page elements should not link directly to image files (as the current P5
graphic element does, or as the svg:image element does -
http://www.w3.org/TR/SVG/struct.html#ImageElement). There may be more than one
image file for a given page, taken under different
lighting conditions, at different resolutions, with different colour models, etc.
Real-Life Examples: Four projects with
distinctly different requirements:
1) An edition of a single text from a single manuscript. This manuscript is not
illustrated or illuminated in any way, but I am noting all abbreviations and
distinctive paleographical aspects in the manuscript, and descriptors about the
condition of the manuscript, so I would like to be able to have coordinates
available for those.
Basic needs:
-
Connect pages of text to pages on the physical
manuscript.
-
Connect areas of text to the corresponding areas on the
physical manuscript.
2) An edition of a single text from several different manuscripts.
Again, none of these manuscripts are illustrated or illuminated, though again I
will be noting abbreviations, distinctive paleographical aspects, and condition
of all manuscripts, so I would like to have coordinates available for those.
Basic needs:
-
Connect pages of text to pages on the physical manuscript. I
might also want to have a way to show how the manuscripts variously
correspond to one another.
-
Connect areas of text to the corresponding areas on the
physical manuscript - again, by way of coordinates mapped to a page
image.
3) An edition developed from one manuscript. This manuscript is not illustrated
or illuminated; however, it does consist of several interrelated texts. On every
page there is a main text, three sets of marginalia, and interlinear notes; some
pages also have headings of various sorts. So we need to edit every set of text
and link those texts together, in addition to linking all the texts to the
coordinates of the manuscript folia.
Basic needs:
-
Connect the textual areas to corresponding areas on the
physical manuscript.
-
Link the various textual areas together, according to their
layout on the page.
4) An edition of a single text in one manuscript. This text also consists of
several different texts; in addition, the manuscript is heavily illustrated,
sometimes with two or more illustrations on a single page.
Basic needs:
-
Connect the text and images to the physical pages on which
they reside.
-
Describe the illustrations and link those descriptions to the
areas on the physical manuscript.
-
Link the textual areas and illustrations, according to their
layout on the page
Considering Encoding Guidelines based on the various needs of these
Projects:
1) A method for linking pages of text to page images.
For a transcript from a single manuscript, this could be done through an
attribute on the <pb> element:
<pb n="154v" url="image.jpg"/>
Even in cases where there are multiple manuscripts edited together, through the
@ed attribute on <pb> it would still be possible to link directly to image
files:
...<pb n="154v" ed="#A" url="A/image.jpg"/>...<pb n="20" ed="#B"
url="B/image.jpg"/>...
However, there will be instances where there are multiple
images for a single page (for example, images taken under regular light, scanned
from microfilm, and taken under ultraviolet light; or the same image in different
resolutions or file types).
One relatively simple approach would be to create a new attribute (@fax) with type
data.pointers (rather than data.pointer), which would contain pointers to each
individual file:
<pb fax="A/image1.jpg A/image2.jpg B/image1.jpg"/>
Another option is to map the image files to the manuscript page numbers, most likely in
a section in the TEI Header. The EPPT provides an Image Catalog that serves this
purpose (see
http://www.tei-c.org.uk/wiki/index.php/LegacyFacsimileMarkup#Image_Catalog), and
METS also provides a method for organizing image (and text) files to show the
structure of the physical and digital object (see
http://www.tei-c.org.uk/wiki/index.php/LegacyFacsimileMarkup#Using_METS_to_link_text_and_image).
METS is quite complicated, and though it might influence the TEI image linking
recommendations, it should not be adopted in full. The EPPT Image Catalog is
simple and also provides guidance for TEI image linking recommendations. Conal's
v.3 Straw Man Facsimile Markup
(http://www.tei-c.org.uk/wiki/index.php/StrawManFacsimileMarkup#v_v3) draws on
METS, especially, and allows an editor to:
a) Identify different categories for different image types,
which may be based on various aspects of the image files: whether the images
were taken under regular light, ultraviolet light, or scanned from microfilm
or transparencies; the size of the files (thumbnails, reference size, archive
size); whether the file or image is annotated; and file types (TIFF, JPEG, GIF,
etc.). Assign each category a unique xml:id.
b) For each page, list all image files (with various metadata), linking each
file to a category (using @facsimiletype as a ref to the category xml:id).
Each page in turn is assigned a unique xml:id.
c) Within the body of the TEI file, use @decls on each <pb> to reference
the xml:id of the containing page.
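The three steps above amount to a small lookup structure. As a rough Python sketch (the class names, the category labels, and the images_for helper are all invented for illustration and are not part of the Straw Man proposal), resolving a <pb decls="#pg1"/> pointer to the image files of one category might look like this:

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ImageFile:
    url: str
    category: str  # xml:id of the image category this file belongs to


@dataclass
class Page:
    xml_id: str
    files: List[ImageFile] = field(default_factory=list)


# (a) categories, each with a unique id (descriptions are illustrative)
categories: Dict[str, str] = {
    "access": "regular light, reference size",
    "ultraviolet": "ultraviolet light",
}

# (b) pages, each listing its image files and the category of each file
pages: Dict[str, Page] = {
    "pg1": Page("pg1", [
        ImageFile("access-p1.jpg", "access"),
        ImageFile("uv-p1.jpg", "ultraviolet"),
    ]),
}


def images_for(decls: str, category: str) -> List[str]:
    """(c) Resolve a pb/@decls pointer to the files of one category."""
    page = pages[decls.lstrip("#")]
    return [f.url for f in page.files if f.category == category]


print(images_for("#pg1", "ultraviolet"))
```

The point of the indirection is that the <pb> in the text never names a file: adding a new set of images only means adding entries to the page list.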
2) A method for linking areas of text to corresponding areas on the images.
Once the pages of text are linked to the corresponding image files, we need a
way to use image coordinates to link sections of text within the pages to areas
on the images. As in 1), the storage place for coordinates will vary
depending on the number of manuscripts and image files in the project.
We can assume that in most instances one image file will correspond to one page,
and in this case it would not be necessary to relate image coordinates to
<pb>. However (as I know from one of the projects I am working on now) it
is possible that image files will be of facing pages. In this case we would want
to differentiate between the text from the left side of the image file and the
right side of the image file. This could be done in two ways:
a) Store these coordinates in a @coords attribute on the
<pb> in the TEI body. This would only work if the manuscript(s) being
edited only had one set of image files.
b) Store these coordinates in the image file list section of the TEI header
(add a @coords attribute on <graphic>, see again
http://www.tei-c.org.uk/wiki/index.php/StrawManFacsimileMarkup#v_v3). This
would work no matter how many image files are in a project, since all images
are listed separately by page. It should also be mentioned that in the case
where two pages are on a single image file, the image file will be listed
twice in the Page List, once for each containing page.
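For the facing-pages case, the two sets of page coordinates could be worked out along the following lines. This Python sketch is only an illustration: the function name facing_page_boxes and the gutter_x parameter are made up, and it assumes boxes are written as left, top, width, height and that the gutter is a straight vertical line.

```python
from typing import Dict, Optional, Tuple

Box = Tuple[int, int, int, int]  # left, top, width, height


def facing_page_boxes(image_width: int, image_height: int,
                      gutter_x: Optional[int] = None) -> Dict[str, Box]:
    """Return one bounding box per page for an image of two facing pages.

    gutter_x is the x position of the gutter between the pages. When it is
    not given, the image is assumed to be split exactly down the middle.
    """
    if gutter_x is None:
        gutter_x = image_width // 2
    if not 0 < gutter_x < image_width:
        raise ValueError("gutter must lie inside the image")
    return {
        "left": (0, 0, gutter_x, image_height),
        "right": (gutter_x, 0, image_width - gutter_x, image_height),
    }


boxes = facing_page_boxes(3000, 2000, gutter_x=1480)
print(boxes["left"], boxes["right"])
```

Either box could then be written into @coords on the relevant <pb> (option a) or on the relevant <graphic> entry in the Page List (option b).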
To link from sub-page areas of text to image coordinates is again
more or less complicated depending on the number of manuscripts and different
image files making up an edition.
a) A single text from a single manuscript with one set of representative image
files. Image file references can be stored directly in <pb>
through @url. Coordinates for that image file could be stored directly in the
relevant text elements through a @coords attribute on the following elements:
<c> (analysis)
<abbr> (core)
<damage> (transcr)
<unclear> (core)
<add> (core)
<del> (core)
<restore> (transcr)
<sic> (core)
<handShift> (transcr)
<space> (transcr)
<fw> (transcr)
<w> (analysis)
These elements clearly represent items of interest on the physical page.
Building our own fragment identifiers,
we can combine a pointer to the image file and the coordinates into a single
URI in place of @coords above:
<!-- the numbers in the URL fragment represent, respectively, the left, top, width, and height -->
<w fax="p1edA.jpg (10,10,20,20) p1edB.jpg (20,20,24,22)">foo</w>
b) A single text from a single manuscript with multiple representative image
files. Because reference points within the image files will be
different from one another, it is not enough to simply store one set of
coordinates. There are two possible options:
1. Manually create
bounding-boxes for each image file and store these as part of the Page List
information in the header. Assign each text element an @xml:id and link the
elements to the coordinates in the Page List through @decls [modified from v
v3]:
<teiHeader>
<!-- list of pages -->
<pg xml:id="pg1" width="100mm" height="120mm">
<graphic graphicType="#access" xml:id="access-p1" url="access-p1.jpg">
<coords decls="#abbr1" x="40" y="460"/>
<coords decls="#add1" x="75" y="348"/>
</graphic>
<graphic graphicType="#ultraviolet" xml:id="uv-p1" url="uv-p1.jpg">
<coords decls="#abbr1" x="23" y="345"/>
<coords decls="#add1" x="93" y="769"/>
</graphic>
</pg>
...
</teiHeader>
<body>
...
<pb xml:id="pb1" decls="#pg1"/>
...<abbr xml:id="abbr1" type="macron">...</abbr>...<add xml:id="add1" type="marginal">...</add>
</body>
This <pg>, however,
seems overly verbose. It would be better if the regions defined by the
<coords> were defined only once, in a device-independent coordinate
space belonging to the <pg>, and the individual <graphic>s mapped
to their enclosing <pg>s by a coordinate transformation, clipping, etc.
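To show what a processor would have to do with the verbose form, here is a small Python sketch that reads a Page List like the one above and collects, for one text element, the coordinates recorded against each image. The <pg> and <coords> elements are the proposal's own; the function name coords_for is invented, and the sketch assumes the elements are in no namespace.

```python
import xml.etree.ElementTree as ET

PAGE_LIST = """
<pg xml:id="pg1" width="100mm" height="120mm">
  <graphic graphicType="#access" xml:id="access-p1" url="access-p1.jpg">
    <coords decls="#abbr1" x="40" y="460"/>
    <coords decls="#add1" x="75" y="348"/>
  </graphic>
  <graphic graphicType="#ultraviolet" xml:id="uv-p1" url="uv-p1.jpg">
    <coords decls="#abbr1" x="23" y="345"/>
    <coords decls="#add1" x="93" y="769"/>
  </graphic>
</pg>
"""


def coords_for(page: ET.Element, element_id: str) -> dict:
    """Map each image URL on the page to the (x, y) recorded for element_id."""
    found = {}
    for graphic in page.iter("graphic"):
        for coords in graphic.iter("coords"):
            if coords.get("decls") == "#" + element_id:
                found[graphic.get("url")] = (
                    int(coords.get("x")),
                    int(coords.get("y")),
                )
    return found


page = ET.fromstring(PAGE_LIST)
print(coords_for(page, "abbr1"))
```

Every region has to be looked up once per image, which is exactly the duplication that a single device-independent coordinate space on <pg> would avoid.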
2. Automatically map coordinates from one image
file to the others. This can be done since the mappings between the
<pg> and the graphics which it contains are well defined. For example, consider
a word which appears in a 10x10 square on image-a:
<w fax="image-a.jpg (10,10,10,10)">foo</w>
We can then automatically
derive the corresponding portion of image-b, too, so long as there is a
<pg> element which includes a graphic whose URL starts with
"image-a.jpg" as well as a graphic whose URL starts with "image-b.jpg". To
calculate it, parse the fragment id of each graphic into left, top, width,
and height, multiply by the graphic/@scale factor, and it should all come
out in the wash.
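Here is one way the arithmetic might go, as a hedged Python sketch. It assumes that each graphic records where the page's top-left corner falls in the image (left, top, in pixels) and a scale expressed as page units per image pixel; the Graphic class and the function names are invented for the illustration, and the exact meaning of graphic/@scale would still need to be pinned down.

```python
from dataclasses import dataclass
from typing import Tuple

Box = Tuple[float, float, float, float]  # left, top, width, height


@dataclass
class Graphic:
    left: float   # x of the page's top-left corner within the image, in pixels
    top: float    # y of the page's top-left corner within the image, in pixels
    scale: float  # page units per image pixel (assumed meaning of @scale)


def to_page(box: Box, g: Graphic) -> Box:
    """Image pixels -> device-independent page coordinates."""
    l, t, w, h = box
    return ((l - g.left) * g.scale, (t - g.top) * g.scale,
            w * g.scale, h * g.scale)


def from_page(box: Box, g: Graphic) -> Box:
    """Device-independent page coordinates -> image pixels."""
    l, t, w, h = box
    return (l / g.scale + g.left, t / g.scale + g.top,
            w / g.scale, h / g.scale)


def map_box(box: Box, source: Graphic, target: Graphic) -> Box:
    """Carry a region from one image of a page to another image of it."""
    return from_page(to_page(box, source), target)


image_a = Graphic(left=0, top=0, scale=1.0)
image_b = Graphic(left=20, top=10, scale=0.5)  # finer resolution, with a margin
print(map_box((10, 10, 10, 10), image_a, image_b))
```

This only covers translation and uniform scaling. Images that are rotated or skewed relative to one another would need a fuller transformation.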
c) A single text from multiple manuscripts with one set of representative image
files for each manuscript. Since <pb> can differentiate between
manuscripts through @ed (so page breaks for multiple manuscripts can
co-locate within the same TEI document), coordinates for pages, if needed, can
be stored directly in a @coords attribute on the various <pb> elements
or within a @coords attribute on individual <graphic> elements if the
tei:figure/tei:graphic method of identifying page images is
used.
When we come to storing coordinates in the elements representing
stuff on the physical page (<c> through <fw> in 2.a above) it is
rather more complicated. The
approach that we prefer is to use the fragment identifiers described above, by
which we can point to multiple images & coordinates within the same
element.
<w fax="p1edA.jpg (10,10,20,20)
p1edB.jpg (20,20,24,22)">foo</w>
Another way to accomplish this would be to use parallel
segmentation to identify the manuscript from which the various elements
come, and place a @coords or @fax attribute on the physical elements:
<app>
<rdg wit="#A"><damage>fundulus</damage></rdg>
<rdg wit="#B">fundulus</rdg>
</app>
Another option would be to declare a @wit or @ed attribute on the
physical elements and list all the variants together in <choice> -
parallel to <app><rdg> rather than within
<app><rdg>:
<choice>
<abbr wit="#A" coords="2,3,4,5">pncipaliam</abbr>
<abbr wit="#B" coords="5,6,7,8">principalia</abbr>
<expan>principaliam</expan>
</choice>
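As a quick check that this markup is workable, here is a Python sketch that pulls out the reading and coordinates for each witness from a <choice> like the one above. The function name readings_by_witness is invented, and the sketch assumes un-namespaced elements and integer coordinates.

```python
import xml.etree.ElementTree as ET

CHOICE = """
<choice>
  <abbr wit="#A" coords="2,3,4,5">pncipaliam</abbr>
  <abbr wit="#B" coords="5,6,7,8">principalia</abbr>
  <expan>principaliam</expan>
</choice>
"""


def readings_by_witness(choice: ET.Element) -> dict:
    """Return {witness id: (text, coords)} for the children carrying @wit."""
    result = {}
    for child in choice:
        wit = child.get("wit")
        if wit is None:
            continue  # e.g. <expan>, which is editorial and has no witness
        coords = tuple(
            int(n) for n in child.get("coords", "").split(",") if n.strip()
        )
        result[wit.lstrip("#")] = (child.text, coords)
    return result


print(readings_by_witness(ET.fromstring(CHOICE)))
```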
In instances where the
differences are not parallel between the texts (manuscript damage, for
example), <choice> itself is unnecessary. In this example, the text
("fundulus") is the same in both manuscripts, but is only damaged in
manuscript A.
...<damage wit="#A">fundulus</damage>
This approach opens the can of worms called "overlap" - what
happens when part of fundulus is damaged in manuscript A and an overlapping
part is damaged (or deleted, or an addition) in manuscript B? But this is a
more general problem in the TEI that I don't think we should go into here -
but I did want to mention it lest anyone think that it was forgotten or not
considered.
The @wit values link the
physical elements to the preceding <pb> elements, which have the image
files attached to them through @url.
d) A single text from multiple manuscripts with several sets of
representative image files for each manuscript. Combine the Page List
described in 2.b.1 with the markup for multiple manuscripts from 2.c (does
this even make sense?).
3) A method for describing and linking illustrations, illuminations, and
figures to corresponding areas on the image.
As above, the simplest
approach for linking illustrations, illuminations, and figures described in a
TEI file to the image files is through the "facsimile pointer" syntax (the
URL/bounding-box pair that replaced the mp() scheme):
<figure>
<graphic url="p1edA.jpg (10,10,20,20)"/>
</figure>
More advanced image/text
linking could also be enabled through the incorporation of Scalable Vector
Graphics (SVG), the XML standard for describing 2D graphics (such as bounding
boxes on an image file) in XML.
A Quick Look Through the TEI Figures Module
Module: figures (Tables, Formulae, and Graphics, P5 chapter 22)
Elements Defined: table row cell formula figure figDesc graphic
binaryObject
The "figure" module really contains three related - but very
different - aspects.
-
Tables
Elements: table row cell
I don't immediately see anything corresponding to this in SVG. It wouldn't
make sense, anyway - tables are really for organizing textual material.
Anything else?
-
Formulae
Elements: formula
Formula already allows for the inclusion of elements from outside the TEI,
although the examples in the guidelines do not use namespaces, and namespaces are not
mentioned in the text. Should we recommend that <formula> require
namespaces for elements pulled in from elsewhere? If so, what about (as in
the first two examples) when the notation is non-XML? Is @notation really
enough?
-
How will this section generally be used? Will the most
common usages be MathML or ChemML (or something similar from the
sciences), or perhaps simpler formulas expressed in available Unicode
characters?
"By default, a <formula> is assumed to contain character data
which is not validated in any way"
Is this really a good idea? This means that, if you have a formula that
contains non-standard characters, you would be unable to use gaiji
without customization. That seems strange.
-
Graphic Images
Elements: figure graphic binaryObject figDesc
The <figure> element is used to contain images, captions, and textual
descriptions of the pictures.
The images themselves are specified using the <graphic> element, whose
url attribute provides the location of an image.
<figure>
<graphic url="Fig1.pdf"/>
</figure>
<figure> may also contain <head> (providing a title
or heading for the image), <figDesc> (a description of the image), and
<p> (commentary or caption, not a description of the image).
"Where the graphic itself contains large amounts of text, perhaps
with a complex structure, and perhaps difficult to distinguish from the
graphic, the encoder should choose whether to regard the graphic as containing
the text (in which case, a nested <text> element may be included within
the <figure> element) or to regard the enclosed text as being a separate
division of the <text> element in which the graphic appears. In this
latter case, an appropriate divn class element may be used for the text
represented within the graphic, and the <figure> element embedded within
it. The choice will depend to a large degree on the encoder's understanding of
the relationship between the graphic and the surrounding text."
So <figure> may also contain <text>.
TEI Elements and their SVG Equivalents (approx.)
Quick Links:
TEI Chapter 22: Tables, Formulae, and Graphics
Scalable Vector Graphics (SVG) 1.1 Specification
tei:graphic and svg:image
tei:graphic
tei:graphic "indicates the location of an inline graphic,
illustration, or figure."
attributes: (In addition to global attributes)
width The display width of the image
Status: Mandatory when applicable
Datatype: data.outputMeasurement
height The display height of the image
Status: Mandatory when applicable
Datatype: data.outputMeasurement
scale A scale factor to be applied to the image to make it the desired display size
Status: Mandatory when applicable
Datatype: data.probability
url The target URL
Status: Mandatory when applicable
Datatype: data.pointer
Values: The name of a URL which provides the image.
mimeType The MIME type
Status: Mandatory when applicable
Datatype: data.word
Values: The MIME type to be used for the object when it is decoded
<figure>
<graphic url="fig1.png"/>
<head>Figure One: The View from the Bridge</head>
<figDesc>A Whistleresque view showing four
or five sailing boats in the foreground, and a
series of buoys strung out between them.</figDesc>
</figure>
svg:image
svg:image "indicates that the contents of a complete file are to
be rendered into a given rectangle within the current user coordinate system.
The 'image' element can refer to raster image files such as PNG or JPEG or to
files with MIME type of "image/svg+xml""
attributes:
x = "<coordinate>"
The x-axis coordinate of one corner of the rectangular region into which the referenced
document is placed.
If the attribute is not specified, the effect is as if a value of "0" were specified.
Animatable: yes.
y = "<coordinate>"
The y-axis coordinate of one corner of the rectangular region into which the referenced
document is placed.
If the attribute is not specified, the effect is as if a value of "0" were specified.
Animatable: yes.
width = "<length>"
The width of the rectangular region into which the referenced document is placed.
A negative value is an error (see Error processing). A value of zero disables rendering of
the element.
Animatable: yes.
height = "<length>"
The height of the rectangular region into which the referenced document is placed.
A negative value is an error (see Error processing). A value of zero disables rendering of
the element.
Animatable: yes.
xlink:href = "<uri>"
A URI reference.
Animatable: yes.
-
Are x and y necessary?
-
xlink:href instead of url = this is nice.
<figure>
<svg:image xlink:href="fig1.png"/>
<head>Figure One: The View from the Bridge</head>
<figDesc>A Whistleresque view showing four
or five sailing boats in the foreground, and a
series of buoys strung out between them.</figDesc>
</figure>
Thoughts
If we were to recommend a module to import all of SVG, it would
be preferable to use only svg:image and to drop tei:graphic entirely (if
svg:image does indeed do everything we would need it to do in TEI). But if we
don't want to "modulate" SVG (if we just say that we will refer to external
SVG files if we need them), do we still want to maintain a separate
tei:graphic element?
The nested grouping of TEI image elements
tei:figure/tei:graphic|tei:head|tei:figDesc
-
figure "contains a block containing graphics, illustrations, or
figures."
-
graphic "indicates the location of an inline graphic,
illustration, or figure."
-
head "contains any type of heading, for example the title of a
section, or the heading of a list, glossary, manuscript description, etc."
-
figDesc "(Description of Figure) contains a brief prose
description of the appearance or content of a graphic figure, for use when
documenting an image without displaying it."
May map to the SVG:
svg:g/svg:image|svg:title|svg:desc
-
"The 'g' element is a container element for grouping together
related graphics elements."
-
"The 'image' element indicates that the contents of a complete
file are to be rendered into a given rectangle within the current user
coordinate system. The 'image' element can refer to raster image files such
as PNG or JPEG or to files with MIME type of "image/svg+xml""
-
"Each container element or graphics element in an SVG drawing
can supply a 'desc' and/or a 'title' description string where the
description is text-only."
-
"The 'title' child element to an ... element serves the
purposes of identifying the content of the given SVG document
fragment."
Examples
<figure xml:id="id1">
<graphic width="4" height="2" url="fig1.png" mimeType="image/png"/>
<head type="image title">Figure One: The View from the Bridge</head>
<figDesc>A Whistleresque view showing four
or five sailing boats in the foreground, and a
series of buoys strung out between them.</figDesc>
</figure>
<svg:g id="id1">
<svg:image width="4" height="2" xlink:href="fig1.png"/>
<svg:title>Figure One: The View from the Bridge</svg:title>
<svg:desc>A Whistleresque view showing four
or five sailing boats in the foreground, and a
series of buoys strung out between them.</svg:desc>
</svg:g>
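The mapping between the two examples is mechanical enough to sketch in code. The following Python is only an illustration of the correspondences listed above (figure to g, graphic to image, head to title, figDesc to desc). The function name figure_to_svg_group is invented, and the sketch assumes the TEI input is in no namespace, which a real P5 document would not be.

```python
import xml.etree.ElementTree as ET

SVG = "http://www.w3.org/2000/svg"
XLINK = "http://www.w3.org/1999/xlink"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

ET.register_namespace("svg", SVG)
ET.register_namespace("xlink", XLINK)


def figure_to_svg_group(figure: ET.Element) -> ET.Element:
    """Build an svg:g from an un-namespaced TEI <figure>."""
    g = ET.Element(f"{{{SVG}}}g")
    if figure.get(XML_ID):
        g.set("id", figure.get(XML_ID))  # svg:g takes @id, not @xml:id
    graphic = figure.find("graphic")
    if graphic is not None:
        image = ET.SubElement(g, f"{{{SVG}}}image")
        for name in ("width", "height"):
            if graphic.get(name):
                image.set(name, graphic.get(name))
        image.set(f"{{{XLINK}}}href", graphic.get("url", ""))
        # graphic/@mimeType has no counterpart here and is dropped.
    for tei_name, svg_name in (("head", "title"), ("figDesc", "desc")):
        source = figure.find(tei_name)
        if source is not None:
            ET.SubElement(g, f"{{{SVG}}}{svg_name}").text = source.text
    return g


figure = ET.fromstring(
    '<figure xml:id="id1">'
    '<graphic width="4" height="2" url="fig1.png" mimeType="image/png"/>'
    '<head type="image title">Figure One: The View from the Bridge</head>'
    '<figDesc>A Whistleresque view showing four or five sailing boats in the '
    'foreground, and a series of buoys strung out between them.</figDesc>'
    '</figure>'
)
print(ET.tostring(figure_to_svg_group(figure), encoding="unicode"))
```

What the conversion loses (head/@type, graphic/@mimeType) is the same set of gaps raised in the notes that follow.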
Notes
-
svg:g has @id, not @xml:id.
-
tei:graphic/@scale and svg:image/@preserveAspectRatio
(http://www.w3.org/TR/SVG/coords.html#preserveAspectRatio)
are, I believe, similar, but they are Greek to me. I need help here.
-
svg:image lacks @mimeType, but is it necessary?
-
svg:title does not seem to have a type attribute, but since
this element would be used only in reference to image titles, I don't think
that this is a problem.
-
As noted in a previous section, <figure> may also contain
<p> or <text>. There are no equivalent elements in SVG. Would it
make sense to include those elements in an SVG group, in the TEI namespace?
<svg:g id="id1">
<svg:image width="4" height="2" xlink:href="fig1.png"/>
<svg:title>Figure One: The View from the Bridge</svg:title>
<svg:desc>A Whistleresque view showing four
or five sailing boats in the foreground, and a
series of buoys strung out between them.</svg:desc>
<tei:p>paragraph here</tei:p>
<tei:text>text contained on the image</tei:text>
</svg:g>
Or should we encourage use of a standard TEI block element such
as a <div> or <figure> to bracket together an svg element and any
tei elements that need to be tied to it:
<figure>
<svg:g id="id1">
<svg:image width="4" height="2" xlink:href="fig1.png"/>
<svg:title>Figure One: The View from the Bridge</svg:title>
<svg:desc>A Whistleresque view showing four
or five sailing boats in the foreground, and a
series of buoys strung out between them.</svg:desc>
</svg:g>
<p>paragraph here</p>
<text>text contained on the image</text>
</figure>
Mapping attributes
Hold coordinates in individual elements using @mets:coords
(instead of creating @tei:coords)
As described in the METS documentation: COORDS: an optional string attribute listing a set of
visual coordinates within an image (still image or video frame). The COORDS
attribute should be used as in HTML 4.
And in HTML 4.01: This attribute specifies the position and shape on the screen. The
number and order of values depends on the shape being defined. Possible
combinations:
-
rect: left-x, top-y, right-x, bottom-y.
-
circle: center-x, center-y, radius. Note. When the radius value
is a percentage value, user agents should calculate the final radius value
based on the associated object's width and height. The radius should be the
smaller value of the two.
-
poly: x1, y1, x2, y2, ..., xN, yN. The first x and y coordinate
pair and the last should be the same to close the polygon. When these
coordinate values are not the same, user agents should infer an additional
coordinate pair to close the polygon.
Coordinates are relative to the top, left corner of the object.
All values are lengths. All values are separated by commas.
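Note that the HTML rect form gives two corners, whereas the facsimile pointer examples elsewhere in this document give left, top, width, and height. A small Python sketch of the conversion, and of the polygon-closing rule quoted above, might look like this. The function names are invented, and the sketch assumes plain integer values rather than percentages.

```python
from typing import List, Tuple


def rect_coords_to_box(coords: str) -> Tuple[int, int, int, int]:
    """Convert HTML 4 rect coords "left-x,top-y,right-x,bottom-y"
    to (left, top, width, height)."""
    left, top, right, bottom = (int(v.strip()) for v in coords.split(","))
    if right < left or bottom < top:
        raise ValueError("right/bottom must not be smaller than left/top")
    return (left, top, right - left, bottom - top)


def close_polygon(coords: str) -> List[Tuple[int, int]]:
    """Return the polygon's points, repeating the first point if the
    list does not already end with it."""
    values = [int(v.strip()) for v in coords.split(",")]
    if len(values) % 2:
        raise ValueError("poly coords need an even number of values")
    points = list(zip(values[0::2], values[1::2]))
    if points and points[0] != points[-1]:
        points.append(points[0])
    return points


print(rect_coords_to_box("10,10,30,30"))
print(close_polygon("0,0,10,0,10,10"))
```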
METS defines @mets:coords on <area>. For TEI, it would be nice to have
this attribute available to "regular" elements, not to a special element. The
reasoning is that in many cases, especially when dealing with primary source
texts, TEI elements refer not to a text in general but to the text as it
appears in a specific physical document. It may not make sense to allow
@mets:coords on <p>, but it may make perfect sense to allow it on those
elements described in Chapter 18, Transcription of Primary Sources, that
relate to a specific physical occurrence:
<abbr>
<sic>
<add>
<del>
<hi>
<restore>
<gap>
<damage>
<unclear>
<space>
<fw>
SVG defines various attributes for coordinates. The system
is based on a shape (identified by the element), but the attributes vary, so
one could use the same attributes in the same element in various combinations
to achieve the same result:
rectangle:
@svg:x = length
@svg:y = length
@svg:width = length
@svg:height = length
(rounded edges:
@svg:rx = length
@svg:ry = length)
circle:
@svg:cx = "<coordinate>"
The x-axis coordinate of the center of the circle.
@svg:cy = "<coordinate>"
The y-axis coordinate of the center of the circle.
@svg:r = "<length>"
The radius of the circle.
ellipse:
@svg:cx = "<coordinate>"
The x-axis coordinate of the center of the ellipse.
@svg:cy = "<coordinate>"
The y-axis coordinate of the center of the ellipse.
@svg:rx = "<length>"
The x-axis radius of the ellipse.
@svg:ry = "<length>"
The y-axis radius of the ellipse.
polygon:
@svg:points = "<list-of-points>"
The points that make up the polygon. All coordinate values are in the user coordinate system.
@svg:points seems to me to be very similar to @mets:coords,
except that it cannot be used to form a circle (only boundaries with straight
edges)
There may be instances where one would want to use @mets:coords
(for simple circles and bounding boxes) and other times when it would make
more sense to use svg:x/y/width/height etc. (for more complex shapes).
We need to continue looking at SVG, whether we want to be able to
import it in a module or link to external files. Or both.
tei:binaryObject
A rough equivalent for tei:binaryObject in SVG is an svg:image whose
xlink:href uses the "data" URL scheme.
<svg width="4in" height="3in" version="1.1"
xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
<desc>This graphic links to a picture of Larry Masinter</desc>
<image x="200" y="200" width="48px" height="48px"
xlink:href="data:image/gif;base64,R0lGODdhMAAwAPAAAAAAAP///ywAAAAAMAAw
AAAC8IyPqcvt3wCcDkiLc7C0qwyGHhSWpjQu5yqmCYsapyuvUUlvONmOZtfzgFz
ByTB10QgxOR0TqBQejhRNzOfkVJ+5YiUqrXF5Y5lKh/DeuNcP5yLWGsEbtLiOSp
a/TPg7JpJHxyendzWTBfX0cxOnKPjgBzi4diinWGdkF8kjdfnycQZXZeYGejmJl
ZeGl9i2icVqaNVailT6F5iJ90m6mvuTS4OK05M0vDk0Q4XUtwvKOzrcd3iq9uis
F81M1OIcR7lEewwcLp7tuNNkM3uNna3F2JQFo97Vriy/Xl4/f1cf5VWzXyym7PH
hhx4dbgYKAAA7">
<title>Larry Masinter</title>
</image>
</svg>
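For reference, building and reading such a "data" URL takes only a few lines. This Python sketch uses the standard base64 module; the function names to_data_url and from_data_url are invented, and the sketch handles only the base64 form shown above.

```python
import base64
from typing import Tuple


def to_data_url(data: bytes, mime_type: str) -> str:
    """Wrap raw bytes in a base64 data: URL."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def from_data_url(url: str) -> Tuple[str, bytes]:
    """Return (mime type, raw bytes) for a base64 data: URL."""
    header, _, payload = url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("only base64 data: URLs are handled in this sketch")
    mime_type = header[len("data:"):-len(";base64")]
    # Line breaks inside an XML attribute value are dropped before decoding.
    return mime_type, base64.b64decode("".join(payload.split()))


url = to_data_url(b"GIF89a", "image/gif")
print(url)
print(from_data_url(url))
```

The same encoded payload is what would sit inside a tei:binaryObject, with the media type moved to @mimeType.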
Using SVG with TEI
In some instances, it makes sense to enable broad use of SVG
throughout the TEI document. For instance, although most of the examples here
relate to the embedding of bitmap data, the real power of SVG is in describing
vector information, and it's the perfect tool for capturing dividing lines on
the page, shapes, blocks of text set off from the page, decorative flourishes,
simple diagrams, graphs, logos and so on. This does not mean that, for example,
illuminated manuscript pages could be described in SVG in place of
hi-res page images, but it would be useful to specify the layout of a complex
MS page (where there might be annotations on commentaries on commentaries) in
terms of SVG shape structures, with the text element blocks embedded inside
them, enabling a usefully approximate rendering of the page layout
incorporating the transcription.
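As a very rough illustration of that last idea, the following Python sketch generates an SVG outline of a page from a list of text regions. Everything in it is hypothetical: the region names, their coordinates, the 100 x 120 page size, and the page_layout function are made up, and a real encoding would embed TEI text elements rather than plain strings.

```python
import xml.etree.ElementTree as ET

SVG = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG)

# Hypothetical regions of one page: id -> (x, y, width, height, text)
REGIONS = {
    "main": (30, 20, 60, 80, "main text"),
    "gloss-left": (5, 20, 20, 80, "marginal gloss"),
}


def page_layout(width: int, height: int, regions: dict) -> ET.Element:
    """Draw one outlined rectangle, with its text, per region of the page."""
    svg = ET.Element(f"{{{SVG}}}svg", {"viewBox": f"0 0 {width} {height}"})
    for name, (x, y, w, h, text) in regions.items():
        g = ET.SubElement(svg, f"{{{SVG}}}g", {"id": name})
        ET.SubElement(g, f"{{{SVG}}}rect", {
            "x": str(x), "y": str(y), "width": str(w), "height": str(h),
            "fill": "none", "stroke": "black",
        })
        label = ET.SubElement(
            g, f"{{{SVG}}}text", {"x": str(x + 1), "y": str(y + 5)}
        )
        label.text = text
    return svg


print(ET.tostring(page_layout(100, 120, REGIONS), encoding="unicode"))
```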