This study examines referential choice in spoken and written Russian discourse. I treat referential choice as a three-way opposition between full noun phrases, pronouns, and zero noun phrases. The study is based on stories, each of which was presented by its narrator twice: once in spoken and once in written form. In each story, all noun phrases were identified and described according to 29 parameters. I trained logistic regression models and decision trees on the collected samples and analyzed factor importance diagrams built from the decision trees. The models and diagrams show that some factors affect referential choice differently in spoken and written discourse, for instance grammatical role, semantic hyperrole, and sloppy identity between the anaphor and the antecedent. The models also show that the sets of significant factors for the two samples are not identical: in particular, the referent’s animacy and the anaphor’s semantic hyperrole are present only in the decision tree for written discourse.