Abstract

This chapter discusses two non-parametric methods that are popular in corpus linguistics: conditional inference trees and conditional random forests. These methods allow the researcher to model and interpret the relationships between a numeric or categorical response variable and a set of predictors. They are particularly attractive in ‘tricky’ situations where parametric methods (in particular, regression models) can be problematic, for example with ‘small n, large p’ data, complex interactions, non-linearity and correlated predictors. For illustration, the chapter presents a case study of T and V politeness forms in Russian based on a corpus of film subtitles.
