Abstract

We investigate internal and stylistic factors affecting binary and ternary relativizer choice in subject relative clauses (that vs which) and non-subject relative clauses (NSRCs; that vs which vs zero). We employ a novel methodological approach to predicting relativizers: Bayesian regression modeling combined with dimensionality reduction of the model inputs via factor analysis. The factor analysis is motivated by the high degree of redundancy and collinearity in natural language data, while Bayesian regression models are robust to the effects of data sparseness and (near) separation. We find that in both types of relative clause, the more marked variant (which) is preferred in complex contexts, while the unmarked variant (that, or zero in NSRCs) is favored in contexts where the relative clause is short and more fully integrated with the NP it modifies. We also find that the use of which is somewhat more sensitive to stylistic considerations in subject than in non-subject relative clauses, and that which correlates most strongly with features associated with lexical density, e.g. ‘nouniness’, rather than with those often associated with formality, e.g. passivization and sentence length.

