closed 4.2.1.1 2005-05-20T16:20:06.396-06:00 2004-06-12T21:32:11.307-06:00 TSD4.2

Archaeological Markup Language (ArchaeoML), version 0.9, February 2006. Created by David Schloen of the University of Chicago.

DiscourseUnit document type.

A DiscourseUnit document represents a meaningful component of a text as understood by a particular editor, e.g., the whole text, a paragraph, a sentence, a clause, a phrase, a word, or a morpheme.

Any number of properties, each consisting of a variable-value pair, can be attached to a discourse unit. The variables will usually be linguistic or grammatical, but they need not be. Properties can be recursively nested in a logical hierarchy.

The original written form of a text (whether an ancient manuscript or inscription or a modern printed text) is normally represented by EpigraphicUnit documents to which the DiscourseUnit document is linked. In some cases there is no written text, so the concept of an epigraphic unit does not apply, yet a representation of the discourse unit in a non-Latin alphabet (e.g., Devanagari) is still needed in addition to the Latin transcription. A transliteration string may therefore be used here instead of links to epigraphic units.

The epigraphicUnits element identifies the individual epigraphic units that correspond to a discourse unit via links to a sequence of EpigraphicUnit documents.

A transliteration is a representation of a text or part of a text in its original form, in contrast to a transcription, which is a more-or-less phonetic representation of the text in the Latin alphabet (with special diacritics and IPA characters used as needed). In some cases a transliteration will be a sign-by-sign rendering of nonalphabetic characters, perhaps using a non-Latin font (e.g., Egyptian hieroglyphs, Chinese characters). Unicode encodings of nonalphabetic characters will be used where available.
If the original text is alphabetic, the transliteration will use the appropriate Unicode characters for the relevant alphabetic script (e.g., Latin, Devanagari, Cyrillic, Arabic, Hebrew, etc.).

The author of an alternate reading is indicated by a link to a Person document.
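
As a rough illustration of the content model described so far, the following Python sketch builds a discourse unit with recursively nested variable-value properties and enforces the choice between links to epigraphic units and a transliteration string. It is only a sketch under stated assumptions: the class names (`Property`, `DiscourseUnit`), the field names, and every XML element or attribute name other than `epigraphicUnits` and `transcription` are hypothetical and are not taken from the ArchaeoML schema. Treating the two alternatives as mutually exclusive is likewise one reading of the text, not a confirmed rule.

```python
from dataclasses import dataclass, field
from typing import List, Optional
import xml.etree.ElementTree as ET


@dataclass
class Property:
    """A variable-value pair; child properties form a logical hierarchy."""
    variable: str
    value: str
    children: List["Property"] = field(default_factory=list)

    def depth(self) -> int:
        # Depth of the nesting below and including this property.
        return 1 + max((child.depth() for child in self.children), default=0)

    def to_xml(self) -> ET.Element:
        # "property", "variable" and "value" are hypothetical names.
        elem = ET.Element("property", {"variable": self.variable, "value": self.value})
        for child in self.children:
            elem.append(child.to_xml())
        return elem


@dataclass
class DiscourseUnit:
    transcription: str
    epigraphic_unit_ids: List[str] = field(default_factory=list)
    transliteration: Optional[str] = None
    properties: List[Property] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Assumed reading of the text: either links to epigraphic units
        # or a transliteration string, but not both.
        if self.epigraphic_unit_ids and self.transliteration is not None:
            raise ValueError("use epigraphic unit links or a transliteration, not both")

    def to_xml(self) -> ET.Element:
        root = ET.Element("discourseUnit")  # hypothetical root element name
        ET.SubElement(root, "transcription").text = self.transcription
        if self.epigraphic_unit_ids:
            units = ET.SubElement(root, "epigraphicUnits")
            for doc_id in self.epigraphic_unit_ids:
                # "link" and "docId" are hypothetical names.
                ET.SubElement(units, "link", {"docId": doc_id})
        elif self.transliteration is not None:
            ET.SubElement(root, "transliteration").text = self.transliteration
        for prop in self.properties:
            root.append(prop.to_xml())
        return root


# A spoken word with no written source: transliteration instead of links.
spoken_word = DiscourseUnit(
    transcription="deva",
    transliteration="देव",
    properties=[Property("part of speech", "noun", [Property("number", "singular")])],
)
print(ET.tostring(spoken_word.to_xml(), encoding="unicode"))
```

Modeling the nesting as a list of child properties keeps the hierarchy arbitrarily deep without a separate container type, which matches the statement that properties can be recursively nested.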
The transcription element stores either a phonemic transcription in the usual sense or, for a less well understood language, the best graphic or written representation of the discourse unit in terms of a phonologically defined character set.

The type attribute indicates the level of discourse: paragraph, sentence, phrase, word, or morpheme.

nestedVariableIndex nestedValueIndex nestedDocIdIndex nestedLinkIndex
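
The type attribute and the transcription element might be handled along the following lines. This is a minimal sketch, not the schema's own definition: the function name `make_discourse_unit`, the root element name `discourseUnit`, and the placement of `type` on that root element are assumptions. Only the five level names and the `transcription` element name come from the text above.

```python
import xml.etree.ElementTree as ET

# The five levels of discourse named for the type attribute.
DISCOURSE_LEVELS = ("paragraph", "sentence", "phrase", "word", "morpheme")


def make_discourse_unit(level: str, transcription: str) -> ET.Element:
    """Build a discourse unit element carrying a type and a transcription.

    The root element name and the attribute placement are hypothetical.
    """
    if level not in DISCOURSE_LEVELS:
        raise ValueError(f"unknown level of discourse: {level!r}")
    unit = ET.Element("discourseUnit", {"type": level})
    ET.SubElement(unit, "transcription").text = transcription
    return unit


word = make_discourse_unit("word", "deva")
print(ET.tostring(word, encoding="unicode"))
```

Rejecting unknown levels at construction time mirrors what an enumerated attribute type would do during schema validation.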