The Old German Reference Corpus collects and annotates all Old High German and Old Low German texts. Once complete, the corpus will contain about 650,000 words. It is based on printed editions of the handwritten texts.
All word forms are recorded at the word level as well as at the grapheme level. Alongside the form given in the edition, the corpus lists the manuscript spelling and the corresponding standardized word form. Graphic features such as rubrication in the manuscripts, italicization, transliterated diacritics and ligatures are annotated in the corpus. All word forms are lemmatized. The linguistic annotation covers word classes, morphological information and sentence-level information. In addition, the annotation includes line breaks, paragraphs and other means of structuring the text, as well as information on verse structure and rhyme position.
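The annotation layers described above can be pictured as one record per word form. The following is only a minimal sketch under stated assumptions: the class name `Token`, all field names and the tag values are hypothetical and do not reflect the corpus's actual data format or tagset, and the example values are illustrative rather than taken from the corpus. Splitting a form into single characters is a simplification of grapheme-level segmentation.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Token:
    """Hypothetical record for one annotated word form (not the corpus schema)."""
    edition_form: str      # form as printed in the edition
    manuscript_form: str   # spelling found in the manuscript
    standard_form: str     # standardized word form
    lemma: str             # dictionary form
    pos: str               # word class (illustrative tag, not the real tagset)
    morphology: Dict[str, str] = field(default_factory=dict)
    graphic_features: List[str] = field(default_factory=list)

    def graphemes(self) -> List[str]:
        # Simplification: treats every character as one grapheme.
        return list(self.edition_form)


# Illustrative values only, not an actual corpus entry.
token = Token(
    edition_form="fater",
    manuscript_form="fater",
    standard_form="fater",
    lemma="fater",
    pos="NOUN",
    morphology={"case": "nominative", "number": "singular"},
    graphic_features=["rubricated"],
)
```

Keeping the edition form, manuscript spelling and standardized form as separate fields mirrors the parallel listing described above and lets each layer be queried independently.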
In addition, every text in the corpus has a header containing all relevant linguistic and literary metadata, for example time of origin, linguistic area and transmission context.
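Header information of this kind lends itself to selecting texts by their metadata. The sketch below is an assumption-laden illustration: the class `TextHeader`, its field names, the helper `texts_from_area` and the placeholder values are all hypothetical and are not taken from the corpus.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TextHeader:
    """Hypothetical header record for one text (not the corpus schema)."""
    title: str
    time_of_origin: str        # e.g. a century or a date range
    linguistic_area: str       # region the language of the text is assigned to
    transmission_context: str  # how and where the text has come down to us


def texts_from_area(headers: List[TextHeader], area: str) -> List[str]:
    # Return the titles of all texts whose header names the given area.
    return [h.title for h in headers if h.linguistic_area == area]


# Placeholder entries for illustration only.
headers = [
    TextHeader("Text A", "period 1", "area X", "manuscript collection 1"),
    TextHeader("Text B", "period 2", "area Y", "manuscript collection 2"),
    TextHeader("Text C", "period 1", "area X", "manuscript collection 3"),
]
selected = texts_from_area(headers, "area X")
```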
If you have any further questions, please do not hesitate to contact us.