Abstract

In corpus linguistics, and also in computational linguistics and information retrieval, there is an increasing demand for the automatic classification of large amounts of text. Biber uses the Multi-Feature/Multi-Dimension (MF/MD) method to obtain a classification of English texts. A major disadvantage of this approach is its heavy reliance on frequency counts of complex grammatical features, which are hard to retrieve automatically. In this paper, we investigate whether Biber’s MF/MD method can be used for automatic text classification. For this purpose, the MF/MD method is applied to the ICE-GB corpus, using three different sets of linguistic features. The results indicate that automatic text classification is indeed feasible when word class tags are used as input for the MF/MD method.
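To illustrate what using word class tags as input could look like, the following is a minimal sketch of one plausible preprocessing step: counting tags per text, normalising the counts to a rate per 1,000 tokens, and standardising each rate across texts. The abstract does not describe the implementation, so the function names, the tag labels, and the two toy texts are all assumptions made for illustration. The factor analysis that the MF/MD method applies to such frequencies is not shown.

```python
from collections import Counter
from statistics import mean, pstdev


def tag_rates(tagged_text, per=1000):
    """Return word class tag frequencies normalised to `per` tokens.

    `tagged_text` is a list of (word, tag) pairs (hypothetical input format).
    """
    tags = [tag for _, tag in tagged_text]
    n = len(tags)
    if n == 0:
        return {}
    return {tag: count * per / n for tag, count in Counter(tags).items()}


def standardise(rates_by_text):
    """Z-score each tag's rate across texts (mean 0, standard deviation 1)."""
    all_tags = sorted({tag for rates in rates_by_text.values() for tag in rates})
    z_scores = {name: {} for name in rates_by_text}
    for tag in all_tags:
        values = [rates.get(tag, 0.0) for rates in rates_by_text.values()]
        mu, sd = mean(values), pstdev(values)
        for name, rates in rates_by_text.items():
            # A tag with no variation across texts carries no information.
            z_scores[name][tag] = (rates.get(tag, 0.0) - mu) / sd if sd else 0.0
    return z_scores


# Toy data, invented for illustration only (not taken from ICE-GB).
texts = {
    "conversation": [("I", "PRON"), ("think", "V"), ("you", "PRON"), ("know", "V")],
    "report": [("The", "ART"), ("results", "N"), ("indicate", "V"), ("feasibility", "N")],
}

rates_by_text = {name: tag_rates(tokens) for name, tokens in texts.items()}
z_scores = standardise(rates_by_text)
```

The resulting table of standardised tag rates, one row per text, is the kind of input on which a factor analysis could then be run to look for dimensions of variation.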
