Abstract
This article discusses the theoretical and practical problems of encoding manuscript abbreviations in TEI P5 XML. Abbreviations are challenging to encode because the correspondence between the orthographic sign that marks an abbreviation and what that sign stands for is more complex than it is in non-abbreviated words. The article reviews the terminology used to describe abbreviations, tracing their history from antiquity to their abolition and surveying the taxonomies of abbreviations presented in paleographical handbooks between 1745 and 2007. It then discusses the editorial treatment of abbreviations in printed editions and relates it to the terminology of the handbooks, criticizing that terminology from a linguistic and editorial point of view and considering how abbreviations are best represented in TEI P5 mark-up. Traditional taxonomies divide abbreviations into groups based on the shape of the abbreviating symbol or the position of the abbreviated content. Some of these distinctions, such as the one between contractions and suspensions, are not relevant for digital encoding. Nevertheless, the system outlined in this article allows them to be tagged in a way that enables quantitative corpus study. The data comes mainly from a TEI P5-based digital edition of The Trinity Seven Planets.