TEI implementation

The source text has been encoded in an XML document (TEI P5 Lite). The tags and the way they are used are described below. (Not included are tags from the TEI header and most attributes.)

Volumes

  • Each of the three volumes of the “Teutsche Academie” is enclosed in its own <text> element; together, these are grouped in a <group> element.
  • Each volume is divided into front matter and content by <front> and <body> tags.
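
A minimal Python sketch of how the volume structure could be traversed. The sample document is invented and heavily shortened, and the outer <text> wrapper around <group> is an assumption based on general TEI practice, not something stated above.

```python
import xml.etree.ElementTree as ET

# TEI P5 namespace in ElementTree's "{uri}" notation.
TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical, heavily shortened sample; all content is invented.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <group>
      <text><front><p>Invented front 1</p></front><body><p>Invented body 1</p></body></text>
      <text><front><p>Invented front 2</p></front><body><p>Invented body 2</p></body></text>
      <text><front><p>Invented front 3</p></front><body><p>Invented body 3</p></body></text>
    </group>
  </text>
</TEI>"""


def volume_parts(xml_string):
    """Return one (front, body) pair of elements per volume."""
    root = ET.fromstring(xml_string)
    group = root.find(f".//{TEI}group")
    return [
        (volume.find(f"{TEI}front"), volume.find(f"{TEI}body"))
        for volume in group.findall(f"{TEI}text")
    ]


volumes = volume_parts(SAMPLE)
print(len(volumes))
```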

Pages

  • All pages start with a <pb/> tag. The tag’s “n” attribute contains the unique page title.
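
As an illustration, the page titles could be listed and checked for uniqueness roughly as follows. The fragment and the values of “n” are invented for this sketch.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical fragment; the page titles in "n" are invented.
SAMPLE = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <pb n="page-a"/><p>Invented text on the first page.</p>
  <pb n="page-b"/><p>Invented text on the second page.</p>
</body>"""


def page_titles(xml_string):
    """Return the "n" value of every <pb/> in document order."""
    root = ET.fromstring(xml_string)
    return [pb.get("n") for pb in root.iter(f"{TEI}pb")]


titles = page_titles(SAMPLE)
# Page titles are described as unique, so a duplicate would point to an encoding problem.
has_duplicates = len(titles) != len(set(titles))
print(titles, has_duplicates)
```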

Columns

  • Start of content that takes up the width of the page: <milestone unit="column" n="none"/>
  • Start of content in a left column: <milestone unit="column" n="left"/>
  • Start of content in a center column: <milestone unit="column" n="middle"/>
  • Start of content in a right column: <milestone unit="column" n="right"/>
  • These tags are used wherever a column break occurs and also at the start of each page.
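
Because <pb/> and <milestone/> are empty elements, the text belonging to a page or column has to be collected by walking the document in order. The following Python sketch shows one possible way to do this; the fragment is invented, and the helper names walk and text_by_column are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical fragment; page title and text are invented.
SAMPLE = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <pb n="page-a"/>
  <milestone unit="column" n="none"/>
  <p>Heading across the page.</p>
  <milestone unit="column" n="left"/>
  <p>Left text <milestone unit="column" n="right"/>right text.</p>
</body>"""


def walk(elem):
    """Yield ('start', element) and ('text', string) events in document order."""
    yield ("start", elem)
    if elem.text:
        yield ("text", elem.text)
    for child in elem:
        yield from walk(child)
        if child.tail:
            yield ("text", child.tail)


def text_by_column(xml_string):
    """Map (page title, column) to the text that follows the matching markers."""
    root = ET.fromstring(xml_string)
    page, column = None, None
    chunks = defaultdict(list)
    for kind, value in walk(root):
        if kind == "text":
            chunks[(page, column)].append(value)
        elif value.tag == f"{TEI}pb":
            page = value.get("n")
        elif value.tag == f"{TEI}milestone" and value.get("unit") == "column":
            column = value.get("n")
    result = {}
    for key, parts in chunks.items():
        text = "".join(parts).strip()
        if text:  # drop keys that only collected whitespace between tags
            result[key] = text
    return result


columns = text_by_column(SAMPLE)
print(columns)
```

Note that a column change can occur in the middle of a paragraph, which is why the sketch tracks state per text run rather than per element.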

Text structure

  • The title pages of the volumes are tagged with <titlePage>. Content on these pages is tagged using <byline>, <docAuthor>, <docDate>, <docImprint>, <docTitle>, <pubPlace>, <publisher> and <titlePart>.
  • The structure of the content is encoded by hierarchically nested <div> tags, some of which are specified further using the “type” attribute; the values used are “frontispiece” and “title”.
  • Headlines of these text structure elements are marked with <head>...</head> tags.
  • For summaries at the beginning of chapters, <argument> is used.
  • Some parts of the text are divided into comparatively small sections and have headline-like text that appears inside columns. To keep the text structure manageable and understandable, these cases use <p rend="headline">...</p> instead of <div> blocks plus <head>. Accordingly, summaries of the text following such headlines are not tagged with <argument>, but with <p rend="argumentlike">.
  • Paragraphs are enclosed by <p> tags, lines of verse by <l>, groups of lines of verse by <lg>. On index pages, <list> and <item> are used.
  • Text in marginalia is enclosed by <note place="margin" anchored="0">...</note> markup. The whole note is embedded in the main text just before the line that starts at the same vertical position as the marginal text; in some cases, this position has been shifted slightly to prevent conflicts with markup in the main text.
  • Footnotes: <note place="foot">...</note>. The symbol that marks the footnote has been removed. We do not differentiate between footnotes appearing below the text (at the bottom of the same page) and those located in the marginalia column.
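
Since headings come in two forms (<head> inside <div>, and <p rend="headline"> inside columns), a tool that builds an outline has to look for both. A minimal Python sketch, using an invented fragment and assuming that “rend” holds exactly one value on such paragraphs:

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical fragment; all headings and summaries are invented.
SAMPLE = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <div>
    <head>Invented chapter heading</head>
    <argument><p>Invented chapter summary.</p></argument>
    <p rend="headline">Invented in-column heading</p>
    <p rend="argumentlike">Invented short summary.</p>
    <p>Ordinary paragraph.</p>
  </div>
</body>"""


def headings(xml_string):
    """Return (kind, text) for <head> elements and <p rend="headline"> paragraphs."""
    root = ET.fromstring(xml_string)
    found = []
    for elem in root.iter():
        text = "".join(elem.itertext()).strip()
        if elem.tag == f"{TEI}head":
            found.append(("head", text))
        # Assumption: "rend" carries the single value "headline" here.
        elif elem.tag == f"{TEI}p" and elem.get("rend") == "headline":
            found.append(("headline", text))
    return found


outline = headings(SAMPLE)
print(outline)
```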

People, places, works of art, publications

  • Occurrences of a person: <rs type="person">
  • Occurrences of a place: <rs type="place">
  • Occurrences of an artwork: <rs type="artwork">
  • Occurrences of a publication/writing: <rs type="bibliography">
  • In each of these cases, the value of the attribute “key” contains numeric primary keys of records in a relational database.
  • In the few cases where such markup is interrupted by unrelated content (for instance, image pages), two <rs>...</rs> pairs are used, one before and one after the unrelated content.
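
To illustrate how these references might be collected, the following Python sketch builds an index of database keys per type. The fragment and its key values are invented; that several keys in one “key” attribute are separated by whitespace is an assumption, not a documented fact.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical fragment; names and key values are invented.
SAMPLE = """<p xmlns="http://www.tei-c.org/ns/1.0">
  <rs type="person" key="101">Invented Painter</rs> worked in
  <rs type="place" key="202">Invented Town</rs>, where
  <rs type="person" key="101">he</rs> painted
  <rs type="artwork" key="303">an invented altarpiece</rs>.
</p>"""


def keys_by_type(xml_string):
    """Return {type: sorted list of distinct numeric keys} for all <rs> elements."""
    root = ET.fromstring(xml_string)
    index = defaultdict(set)
    for rs in root.iter(f"{TEI}rs"):
        # Assumption: multiple keys, if present, are whitespace-separated.
        for key in (rs.get("key") or "").split():
            index[rs.get("type")].add(int(key))
    return {rs_type: sorted(keys) for rs_type, keys in index.items()}


entity_index = keys_by_type(SAMPLE)
print(entity_index)
```

Collecting keys into sets also covers the interrupted case described above: two <rs> pairs around unrelated content carry the same key and therefore collapse into one entry.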

Images and text in images

  • Discrete images have been encoded using <figure> tags (containing <graphic>), while inline glyphs and images are tagged using only <graphic>.
  • Image titles are marked using <p rend="caption"> tags.
  • Text displayed in an image is enclosed by <p> tags, with each distinct text block having its own pair of tags. The order of these text blocks follows the Western reading direction: line by line from the top left to the lower right corner, though this order may be ignored to keep together blocks of text that obviously form a unit.
  • Signatures (typically below images): <seg type="signature">...</seg>
  • Privileges/dedications (typically below images): <seg type="dedication">...</seg>
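
A rough Python sketch of how image metadata might be read from this markup. The fragment is invented, and it assumes that the caption and the <seg> elements sit inside the <figure> they belong to and that <graphic> carries the standard TEI “url” attribute; the actual document may arrange these differently.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical fragment; file name and caption are invented, and the
# placement of caption and <seg> inside <figure> is an assumption.
SAMPLE = """<div xmlns="http://www.tei-c.org/ns/1.0">
  <figure>
    <graphic url="invented-plate.jpg"/>
    <p rend="caption">Invented plate title</p>
    <p><seg type="signature">Sandrart delineavit</seg></p>
    <p><seg type="dedication">Cum Gratia et Privilegio</seg></p>
  </figure>
</div>"""


def text_of(elem):
    """Return the full text content of an element, trimmed."""
    return "".join(elem.itertext()).strip()


def describe_figures(xml_string):
    """Return one dictionary per <figure> with image URL, captions and segments."""
    root = ET.fromstring(xml_string)
    result = []
    for figure in root.iter(f"{TEI}figure"):
        graphic = figure.find(f"{TEI}graphic")
        entry = {
            "url": graphic.get("url") if graphic is not None else None,
            "captions": [text_of(p) for p in figure.iter(f"{TEI}p")
                         if p.get("rend") == "caption"],
        }
        for kind in ("signature", "dedication"):
            entry[kind] = [text_of(seg) for seg in figure.iter(f"{TEI}seg")
                           if seg.get("type") == kind]
        result.append(entry)
    return result


figures = describe_figures(SAMPLE)
print(figures)
```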

Typeface

  • All text not marked using the TEI global attribute “rend” as “antiqua” or “italic” is to be regarded as Gothic print. Exception: numbers (mainly consisting of antiqua digits) are usually not marked using “rend”. The same applies to the use of commas, which are usually only used in Antiqua parts of the text, but are not always enclosed by tags having a “rend” attribute with “antiqua” or “italic” as value.
  • With the exception of text in italics, the use of different Antiqua typefaces is not tagged. All text in italics is implicitly to be regarded as Antiqua in italics.
  • We do not distinguish between different types of Gothic print.
  • Graphically designed characters at the beginning of paragraphs: <hi rend="initial">N</hi>Achdem
  • In image signatures and privileges, no distinction between typefaces was made (these are mostly in italics anyway).
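
The default rule (everything is Gothic unless marked otherwise) can be applied programmatically by letting the typeface be inherited from the nearest ancestor with a matching “rend” value. A Python sketch with an invented fragment; the use of <hi> for the Antiqua and italic spans is an assumption, since “rend” is a global attribute and may appear on other elements.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment; the use of <hi> for typeface changes is an assumption.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">Der '
    '<hi rend="antiqua">Pictor</hi> und <hi rend="italic">Sculptor</hi></p>'
)


def typeface_runs(elem, inherited="gothic"):
    """Yield (typeface, text) runs; unmarked text counts as Gothic print."""
    rend = elem.get("rend")
    # "italic" stands for Antiqua in italics; other rend values do not change the typeface.
    current = rend if rend in ("antiqua", "italic") else inherited
    if elem.text:
        yield (current, elem.text)
    for child in elem:
        yield from typeface_runs(child, current)
        if child.tail:
            # The tail follows the child but belongs to the parent's typeface.
            yield (current, child.tail)


runs = list(typeface_runs(ET.fromstring(SAMPLE)))
print(runs)
```

Because of the exceptions listed above (digits, commas, signatures), the result should be read as an approximation of the printed typeface, not as an exact record.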

Internal cross-references

  • For links to other parts of the text, <ref target="...">...</ref> tags are used.
  • With the exception of <corr>...</corr>, <sic>...</sic>, <choice>...</choice> and <milestone/>, all elements in the TEI document’s body can be addressed.
  • When a link refers to a certain point in the text where there is no element that could be linked to, an <anchor/> tag has been inserted.
  • Additionally, <anchor/> tags with “vita-start” as the value of the “type” attribute are used to track the beginnings of artists’ vitae. In such cases, the corresponding person’s ID is given as the value of the “n” attribute.
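
A Python sketch of how links and vita anchors might be resolved. The fragment is invented. The form of the “target” values is not specified above, so the sketch assumes the common TEI pattern of “#” followed by an xml:id; the identifiers and the person ID are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical fragment; identifiers, target syntax and person ID are assumptions.
SAMPLE = """<body xmlns="http://www.tei-c.org/ns/1.0">
  <p xml:id="p1">See <ref target="#a1">the invented passage</ref>.</p>
  <p xml:id="p2">Some text <anchor xml:id="a1"/>that is referred to.</p>
  <p><anchor type="vita-start" n="101"/>An invented vita begins here.</p>
</body>"""


def resolve_refs(root):
    """Return (target string, target element or None) for every <ref>."""
    by_id = {elem.get(XML_ID): elem for elem in root.iter() if elem.get(XML_ID)}
    resolved = []
    for ref in root.iter(f"{TEI}ref"):
        target = ref.get("target", "")
        # Assumption: targets have the form "#some-id".
        resolved.append((target, by_id.get(target.lstrip("#"))))
    return resolved


def vita_starts(root):
    """Map the person ID in "n" to the <anchor/> marking the start of the vita."""
    return {
        anchor.get("n"): anchor
        for anchor in root.iter(f"{TEI}anchor")
        if anchor.get("type") == "vita-start"
    }


root = ET.fromstring(SAMPLE)
links = resolve_refs(root)
vitae = vita_starts(root)
print(links[0][0], sorted(vitae))
```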

Languages

  • Non-German text is marked either by <foreign xml:lang="...">...</foreign> tags or by means of xml:lang attributes. Any text not marked in one of these ways is German.
  • These ISO language codes are used:
    • Coptic: cop
    • Danish: da
    • Dutch: nl
    • English: en
    • French: fr
    • German: de (only for inserted text in non-German sections)
    • Greek: el. The Greek in the “Teutsche Academie” contains elements of ancient Greek and modern Greek as well as characters from antiquity, much of it orthographically or grammatically incorrect. Therefore, we decided not to differentiate between ancient and modern Greek and to use one language code for everything.
    • Hebrew: he
    • Italian: it
    • Latin: la
    • Portuguese: pt
    • Spanish: es
  • Usually, only phrases are marked, but for other character sets (e.g., Greek), single words or letters are marked as well. Single words in Latin are not tagged, due to their frequent use in 17th-century German. Moreover, recurring phrases in signatures (“Sandrart delineavit”, “Waldreich sculpsit”, “Cum Gratia et Privilegio”) are not tagged.
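
Both marking styles end up as an xml:lang attribute, so foreign-language passages can be collected without caring which element carries it. A minimal Python sketch with an invented fragment; the <q> element with xml:lang is only one hypothetical example of the attribute-based style.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Hypothetical fragment; the sentence is invented.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">Der Spruch '
    '<foreign xml:lang="la">ars longa, vita brevis</foreign> und '
    '<q xml:lang="it">la bella maniera</q>.</p>'
)


def foreign_passages(xml_string):
    """Return {language code: [text, ...]} for every element carrying xml:lang."""
    root = ET.fromstring(xml_string)
    found = defaultdict(list)
    for elem in root.iter():
        lang = elem.get(XML_LANG)
        if lang:
            # Nested language changes would be included in the outer text as well.
            found[lang].append("".join(elem.itertext()).strip())
    return dict(found)


passages = foreign_passages(SAMPLE)
print(passages)
```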

Errors, unclear words etc.

  • Typesetting errors: <choice><sic>mistake</sic> <corr>correction</corr></choice>
  • Words or parts of words that are superfluous in the source text: <del>Word</del>
  • Not readable: <gap reason="..."/>. If inside a word, the word is enclosed by <unclear>...</unclear>. <gap/> is also used inside <sic>...</sic> for encoding glyphs not available in UTF-8.
  • Not readable without ambiguity: <unclear reason="...">...</unclear>, partially with additional annotation.
  • Special characters / symbols which cannot be represented with UTF-8 have been inserted as <graphic/> elements.
  • Additionally, some Unicode characters are used (for instance, U+2183), which are replaced with images in the edition to avoid client-side problems.
  • In Latin words containing a “q” with accent, this has been transcribed as a simple “q”, and the word is tagged using <reg>.
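
The <choice> and <del> markup allows two readings to be derived from the same source: a corrected one and one that follows the print. The following Python sketch shows one way this could be done; the fragment and its misprint are invented, and the function name reading is hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Hypothetical fragment with an invented misprint and an invented repeated word.
SAMPLE = (
    '<p xmlns="http://www.tei-c.org/ns/1.0">Die '
    "<choice><sic>Knnst</sic> <corr>Kunst</corr></choice> und "
    "<del>und </del>die Natur.</p>"
)


def reading(elem, corrected=True):
    """Return the corrected reading, or the reading as printed if corrected=False."""
    skipped = f"{TEI}sic" if corrected else f"{TEI}corr"
    parts = []

    def visit(node):
        if node.tag == skipped:
            return
        if corrected and node.tag == f"{TEI}del":
            return  # superfluous words are dropped from the corrected reading
        # Whitespace directly inside <choice> only separates <sic> and <corr>.
        inside_choice = node.tag == f"{TEI}choice"
        if node.text and not inside_choice:
            parts.append(node.text)
        for child in node:
            visit(child)
            if child.tail and not inside_choice:
                parts.append(child.tail)

    visit(elem)
    return "".join(parts)


paragraph = ET.fromstring(SAMPLE)
corrected_text = reading(paragraph, corrected=True)
printed_text = reading(paragraph, corrected=False)
print(corrected_text)
print(printed_text)
```

The sketch leaves <gap/>, <unclear> and <reg> untouched; how these should appear in either reading is an editorial decision not covered here.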

Miscellaneous

  • Quotation: <q>...</q>
  • Explicit dates are enclosed by <date value="..."> elements. The “value” attribute contains the date given by Sandrart, which has not been validated.