Abstract
In this paper, we describe a framework for developing probabilistic classifiers in natural language processing. Our focus is on formulating models that capture the most important interdependencies among features, so as to avoid overfitting while still characterizing the data well. The class of probability models and the associated inference techniques described here were developed in mathematical statistics, and are widely used in artificial intelligence and applied statistics. Our goal is to make this model selection framework accessible to researchers in NLP, and to provide pointers to available software and important references. In addition, we describe how the quality of the three determinants of classifier performance (the features, the form of the model, and the parameter estimates) can be separately evaluated. We also demonstrate the classification performance of these models in a large-scale experiment involving the disambiguation of 34 words taken from the HECTOR word sense corpus (Hanks 1996). In 10-fold cross-validations, the model search procedure performs significantly better than naive Bayes on 6 of the words without being significantly worse on any of them.
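The sketch below illustrates only the evaluation setup mentioned above: a naive Bayes baseline for word sense disambiguation scored with 10-fold cross-validation. It is not the authors' implementation or model search procedure, and it does not use the HECTOR corpus; the bag-of-words features, the toy sentences, and the variable names (`contexts`, `senses`) are hypothetical stand-ins chosen for illustration.

```python
# Minimal sketch of a naive Bayes word sense disambiguation baseline,
# evaluated with 10-fold cross-validation (cf. the experiment described
# in the abstract). The data below is a hypothetical toy set, not HECTOR.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline

# Hypothetical contexts for the ambiguous word "bank", 10 per sense so
# that stratified 10-fold cross-validation is possible on this toy set.
finance = ["the bank raised interest rates", "deposit money at the bank",
           "the bank approved the loan", "check the bank account balance",
           "the bank charged a monthly fee"] * 2
river = ["we sat on the river bank", "the bank of the stream eroded",
         "fishing from the muddy bank", "tall trees lined the bank",
         "the flood covered the grassy bank"] * 2
contexts = finance + river
senses = ["finance"] * len(finance) + ["geography"] * len(river)

# Naive Bayes treats the contextual features as conditionally independent
# given the sense; the model search framework described in the paper
# instead selects which feature interdependencies to represent.
classifier = make_pipeline(CountVectorizer(), MultinomialNB())

# 10-fold cross-validation accuracy for the baseline.
scores = cross_val_score(classifier, contexts, senses, cv=10)
print("mean accuracy:", scores.mean())
```

On real disambiguation data, the same `cross_val_score` call would be the natural place to compare the baseline against a richer selected model, fold by fold, before testing whether the difference is significant.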