Abstract

Bayesian belief networks (BBN) learned from corpora and support vector machines (SVM) have been applied to the automatic acquisition of verb subcategorization frames for Modern Greek. We incorporate minimal linguistic resources, i.e. basic morphological tagging and phrase chunking, to demonstrate that verb subcategorization, which is of great significance for developing robust natural language human-computer interaction systems, can be achieved using large corpora without any general-purpose syntactic parser. In addition, apart from BBN and SVM, which have not previously been used for this task, we have experimented with three well-known machine learning methods (feedforward backpropagation neural networks, learning vector quantization and decision tables), which are also being applied to the task of verb subcategorization frame detection for the first time. We argue that both BBN and SVM are well suited for learning to identify verb subcategorization frames, and empirical results support this claim. Performance has been methodically evaluated using two different corpus types, one balanced and one domain-specific, in order to determine the unbiased behaviour of the trained models. Limited training data prove sufficient to yield satisfactory results. We have been able to achieve precision exceeding 80% on the identification of subcategorization frames which were not known beforehand.

