Abstract

In this paper, we propose an efficient algorithm based on the concept of multiobjective optimization (MOO) for performing feature selection and parameter optimization of any machine learning technique. Feature and parameter combinations have a significant effect on the accuracy of a classifier. We perform feature selection and parameter optimization for four different classifiers, namely conditional random field, support vector machine, memory-based learner and maximum entropy. The proposed algorithms are evaluated on the problem of named entity recognition, an important component of many text processing applications. We experiment with four different languages, namely Bengali, Hindi, Telugu and English. First, the proposed MOO-based technique is used to determine the appropriate features and parameters. For each classifier, the algorithm produces a set of solutions on the final Pareto optimal front. Each solution represents a classifier with a particular feature and parameter combination. All these solutions are then combined using a MOO-based classifier ensemble technique. Evaluation results show that the proposed approach attains F-measure (harmonic mean of recall and precision) values of 90.48%, 90.44%, 78.71% and 88.68% for Bengali, Hindi, Telugu and English, respectively. We also show that, in all experimental settings, the proposed feature and parameter optimization technique performs better than the baseline systems, which are developed with random feature subsets. Comparisons with existing works also show the efficacy of our proposed algorithm.
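
To make the notions of F-measure and Pareto optimal front concrete, the following is a minimal Python sketch rather than the algorithm proposed in the paper. It assumes that the two objectives being maximized are recall and precision, which the abstract does not state explicitly, and the candidate labels and scores are invented purely for illustration. Each candidate stands for one classifier with a particular feature and parameter combination.

```python
from typing import List, Tuple

# (label, recall, precision); all values below are hypothetical
Solution = Tuple[str, float, float]


def f_measure(recall: float, precision: float) -> float:
    """Harmonic mean of recall and precision; 0.0 when both are zero."""
    if recall + precision == 0:
        return 0.0
    return 2 * recall * precision / (recall + precision)


def dominates(a: Solution, b: Solution) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a[1] >= b[1] and a[2] >= b[2]
    strictly_better = a[1] > b[1] or a[2] > b[2]
    return no_worse and strictly_better


def pareto_front(solutions: List[Solution]) -> List[Solution]:
    """Keep only the solutions that no other solution dominates."""
    return [s for s in solutions
            if not any(dominates(other, s) for other in solutions)]


# Hypothetical feature/parameter combinations with made-up scores
candidates: List[Solution] = [
    ("A", 0.80, 0.90),
    ("B", 0.85, 0.85),
    ("C", 0.75, 0.85),  # dominated by A on both objectives
    ("D", 0.90, 0.70),
]

front = pareto_front(candidates)
for label, recall, precision in front:
    print(label, round(f_measure(recall, precision), 4))
```

In this toy setting, candidate C is discarded because A is better on both objectives, while A, B and D each represent a different recall/precision trade-off and would all be passed on to an ensemble step.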
