Abstract

We investigate internal and stylistic factors affecting binary and ternary relativizer choice in subject (that vs. which) and non-subject (that vs. which vs. zero) relative clauses. We employ a novel methodological approach to predicting relativizers: Bayesian regression modeling with dimensional reduction of the model inputs via factor analysis. The factor analysis is motivated by the high degree of redundancy and collinearity in natural language data, while Bayesian regression models are robust to the effects of data sparseness and (near) separation. We find that in both types of relative clauses, the more marked variant (which) is preferred in complex contexts, while the unmarked variant (that, or zero in non-subject relative clauses, NSRCs) is favored in contexts where the relative clause is short and more fully integrated with the NP it modifies. We also find that the use of which is somewhat more sensitive to stylistic considerations in subject than in non-subject relative clauses, and that which correlates most strongly with features associated with lexical density, e.g. 'nouniness', rather than with those often associated with formality, e.g. passivization and sentence length.
