Evaluation of Normalization Techniques in Text Classification for Portuguese

Merley Da Silva Conrado, Solange Oliveira Rezende, Víctor Antonio Laguna Gutiérrez

doi:10.1007/978-3-642-31137-6_47

Abstract

Text classification is an important task of Artificial Intelligence. Normally, this task uses large textual datasets whose representation is feasible because of normalization and selection techniques. In the literature, we can find three normalization techniques: stemming, lemmatization, and nominalization. Nevertheless, it is difficult to choose the most suitable technique for the text classification task. In this paper, we investigate this question experimentally by applying five different classifiers to four textual datasets in the Portuguese language. Additionally, the classification results are evaluated using unigrams, bigrams, and the combination of unigrams and bigrams. The results indicate that, in general, the number of terms obtained by each of the cases and the comprehensibility required in the results of the classification can be used as criteria to define the most suitable technique for the text classification task.
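The three term representations compared in the abstract (unigrams, bigrams, and their combination) can be sketched in a few lines. This is only an illustration, not the authors' pipeline: `toy_normalize` is a hypothetical stand-in that strips a trailing plural "s" and is not a real Portuguese stemmer, lemmatizer, or nominalizer, and the function names and the sample sentence are assumptions made for the example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters (accented letters included)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def toy_normalize(token):
    """Hypothetical placeholder: strip a trailing plural 's'. NOT a real stemmer."""
    return token[:-1] if len(token) > 3 and token.endswith("s") else token


def ngrams(tokens, n):
    """Return the list of n-grams (joined by spaces) in token order."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_terms(text, mode="uni", normalize=toy_normalize):
    """Count terms for mode 'uni', 'bi', or 'uni+bi' after normalizing tokens."""
    tokens = [normalize(t) for t in tokenize(text)]
    if mode == "uni":
        return Counter(ngrams(tokens, 1))
    if mode == "bi":
        return Counter(ngrams(tokens, 2))
    if mode == "uni+bi":
        return Counter(ngrams(tokens, 1) + ngrams(tokens, 2))
    raise ValueError(f"unknown mode: {mode}")


doc = "o gato preto e os gatos pretos"
for mode in ("uni", "bi", "uni+bi"):
    print(mode, len(extract_terms(doc, mode)))
```

Passing `normalize=lambda t: t` disables normalization, which makes it easy to compare vocabulary sizes with and without it. The number of distinct terms per representation is one of the criteria the abstract names for choosing a technique.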

Similar Papers

How to Improve Text Summarization and Classification by Mutual Cooperation on an Integrated Framework
Hyoungil Jeong ... Jungyun Seo
Expert Systems With Applications | VOL. 60
10 May 2016

A Review of Metaheuristic Optimization Techniques in Text Classification
Cem Kaya ... Mitat Uysal
International Journal of Computational and Experimental Science and Engineering | VOL. 10
30 Apr 2024

Survey on supervised machine learning techniques for automatic text classification
Ammar Ismael Kadhim
Artificial Intelligence Review | VOL. 52
19 Jan 2019

A Comparative Approach of Dimensionality Reduction Techniques in Text Classification
S. Rahamat Basha ... J. K. Rani
Engineering, Technology & Applied Science Research | VOL. 9
01 Dec 2019
