Abstract
In this paper, we address the problem of training neural networks on small text datasets. Data augmentation is widely used with image and audio datasets to improve model performance, and we adapt this technique to expand a training set of textual data. The main challenge with our dataset is that it is too small to train models effectively. We therefore propose a text augmentation method designed specifically for the Thai language. The method is based on text similarity and uses a model to determine the semantic relationship between two sentences. Experimental results indicate that the proposed method improves text classification performance.
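The general idea of similarity-based augmentation can be illustrated with a minimal sketch. This is not the method of the paper: the abstract does not specify the similarity measure, the model, or the source of candidate sentences. The sketch assumes a pool of unlabeled candidate sentences and substitutes character n-gram Jaccard similarity for the semantic-relationship model. The names `char_ngrams`, `jaccard_similarity`, `augment`, and the `threshold` parameter are hypothetical. Character n-grams are used here because Thai is written without spaces between words, so they avoid the need for a word segmenter.

```python
from typing import List, Set, Tuple


def char_ngrams(text: str, n: int = 3) -> Set[str]:
    """Return the set of character n-grams of a string, ignoring whitespace."""
    compact = "".join(text.split())
    if len(compact) < n:
        return {compact} if compact else set()
    return {compact[i:i + n] for i in range(len(compact) - n + 1)}


def jaccard_similarity(a: str, b: str, n: int = 3) -> float:
    """Jaccard overlap of character n-grams (an assumed stand-in for a learned model)."""
    grams_a, grams_b = char_ngrams(a, n), char_ngrams(b, n)
    if not grams_a or not grams_b:
        return 0.0
    return len(grams_a & grams_b) / len(grams_a | grams_b)


def augment(
    labeled: List[Tuple[str, str]],
    pool: List[str],
    threshold: float = 0.5,
) -> List[Tuple[str, str]]:
    """Extend a labeled set with pool sentences that resemble a labeled sentence.

    Each accepted candidate inherits the label of its most similar labeled
    sentence. The threshold is an illustrative assumption, not a tuned value.
    """
    augmented = list(labeled)
    seen = {text for text, _ in labeled}
    if not labeled:
        return augmented
    for candidate in pool:
        if candidate in seen:
            continue  # skip exact duplicates
        # Find the labeled sentence closest to this candidate.
        best_text, best_label = max(
            labeled, key=lambda pair: jaccard_similarity(candidate, pair[0])
        )
        if jaccard_similarity(candidate, best_text) >= threshold:
            augmented.append((candidate, best_label))
            seen.add(candidate)
    return augmented
```

In practice, the surface-level similarity function would be replaced by a model that scores the semantic relationship between two sentences, and the threshold would be chosen on held-out data.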