This paper presents a corpus-based linguistic investigation of spelling variation in 16th-century Slovene from both diachronic and synchronic points of view. The investigation is based on a manually annotated sample (approx. 14,000 word tokens) from Primož Trubar’s Ta pervi deil tiga Noviga teſtamenta (1557) and Hiſhna poſtilla (1595; TPo 1595), and Jurij Juričič’s Poſtilla (1578; JPo 1578), and it concentrates on clitics and clitic-like elements. The statistical analysis compares the spelling conventions of the early modern period with those of contemporary Slovene by means of normalised forms of the originals, recording cases where one original orthographic word is nowadays written as two or more words (1–n mapping) or vice versa (n–1 mapping). It shows that the overall percentage of split and joined word tokens is 5.7%, with JPo 1578 having the highest percentage and TPo 1595 the lowest, less than half that of JPo 1578. The vast majority of these are cases where a word is now split. The most frequent bound words are the non-syllabic prepositions v ‘in(to)’, k ‘to’, and z ‘with’, followed by the negative proclitic ne ‘not’, the enclitic particle li ‘whether, if’ and, in rare instances, the conditional particle bi, the reflexive particle se, and the prepositions na ‘on’, ob ‘at, by’, pri ‘at, beside’ and za ‘for, behind’. The absolute numbers of specific clitics partially correlate with the prevalence of bound variants over the freestanding variants of those clitics: the most frequent clitics are predominantly bound, while the least frequent are predominantly freestanding.
Individual instances of two accented words written together can be attributed to German influence (figino_drevo, cf. German der Feigenbaum ‘fig tree’). The cases where one modernised word corresponds to two original words are sporadic or can be identified as errors in the original books, with the exception of the superlative adjective/adverb prefix naj-/nar- ‘the most’, which is orthographically bound with its root in about 25% of instances. Of interest are also cases where word beginnings that are homonymous with non-syllabic or monosyllabic prepositions are separated from the remainder of the word with an apostrophe (e.g. s’_nameinja ‘signs’, s’_derſhati ‘to endure’, do_bruta ‘goodness’, sa_doſti ‘enough’). The normalisation also enables the identification of the orthographic variants of the most commonly bound clitics, i.e. the non-syllabic prepositions k, z and v. K and its allomorph /h/ have five attested spelling variants, one of which is limited to hosts beginning with v-. For z, with its voiced allomorph /z/ and voiceless allomorph /s/, three variant spellings were found that correspond only partially to the voiced/voiceless distinction of the initial sound of the host word; cases of merging with a host beginning with s-/z- were also identified. Additional positional spellings probably represent other allomorphs: one for palatalized /ž/ in front of a palatal n, and others, including ſo/so, for syllabified /za/, /zo/.
The preposition v shows the highest degree of orthographic variation of all the analysed words, with 10 different spellings: two general bound spellings and one freestanding spelling; three used in front of a vowel; two attested only in front of v-; and two merged with the initial v- of the host. The analysis of spelling variation in the non-syllabic prepositions showed that even a relatively limited, hand-corrected annotated sample made it possible to identify the majority of the spelling variants described in previous works, while the NoSketch Engine tool provided further information about their relative frequency and distribution. As the hand-corrected corpus is expanded, such research will yield even more relevant information for the study of the 16th-century Slovene literary language, significantly supplementing existing findings (based on traditionally collected examples) with a large amount of statistically relevant data.
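The comparison of original and normalised forms described above can be illustrated with a minimal Python sketch. It is not the authors' procedure: the data layout (pairs of original and normalised word tuples), the function names `classify` and `mapping_shares`, and the placeholder tokens are all assumptions made for illustration, and the sketch counts aligned units rather than individual word tokens.

```python
from collections import Counter

def classify(original, normalised):
    """Classify one aligned unit by the number of orthographic words on each side.

    `original` and `normalised` are tuples of word strings (hypothetical layout).
    """
    if len(original) == 1 and len(normalised) > 1:
        return "1-n"    # one original word, nowadays written as several words
    if len(original) > 1 and len(normalised) == 1:
        return "n-1"    # several original words, nowadays written as one word
    return "other"      # same number of words on both sides

def mapping_shares(alignments):
    """Return the share of 1-n and n-1 units among all aligned units."""
    counts = Counter(classify(orig, norm) for orig, norm in alignments)
    total = sum(counts.values())
    if total == 0:
        return {"1-n": 0.0, "n-1": 0.0}
    return {kind: counts[kind] / total for kind in ("1-n", "n-1")}

# Placeholder tokens only; these are not attested 16th-century forms.
sample = [
    (("ab",), ("a", "b")),    # written together in the original, split today
    (("c", "d"), ("cd",)),    # written apart in the original, joined today
    (("e",), ("e",)),
    (("f",), ("f",)),
]
shares = mapping_shares(sample)
```

On a real corpus, the alignments would come from the manual annotation, and the same counts could be broken down by book or by clitic.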