Abstract

One purpose of detecting crisis-related tweets is to single out those that provide information about help needed and help offered. Classifying such tweets is difficult because too few annotated tweets are available in those categories. To facilitate this classification, a domain- and event-adaptive augmentation approach is proposed. The main objective of the research is to improve the classification of crisis-related tweet categories that have few training samples. The proposed algorithms integrate the inherent domain and event information when selecting words for augmentation. The augmentation draws on the CrisisLex lexicon, Word2Vec embeddings, and WordNet. Experiments are carried out to substantiate the benefits of augmentation. Results indicate that classifier performance increases when the classifier is provided with the expanded dataset comprising both the augmented and the original tweets. A novel tweet augmentation algorithm can therefore be used to combat the overfitting and class imbalance that arise from the scarcity of training samples. The advantage of the proposed algorithms is their ability to retain the structure and inherent nature of the tweets during augmentation.
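The abstract does not spell out the augmentation algorithms, so the following is only a minimal sketch of the general idea of lexicon-aware word replacement, not the authors' method. It assumes that terms found in a crisis lexicon are kept intact while other words may be swapped for near-synonyms. The names `CRISIS_LEXICON`, `SYNONYMS`, and `augment_tweet` are hypothetical, and the small hard-coded tables stand in for CrisisLex and for WordNet or Word2Vec lookups.

```python
import random

# Hypothetical stand-in for a crisis lexicon such as CrisisLex.
CRISIS_LEXICON = {"flood", "shelter", "rescue", "donate", "evacuate"}

# Hypothetical stand-in for synonyms drawn from WordNet or Word2Vec neighbours.
SYNONYMS = {
    "urgent": ["pressing", "immediate"],
    "help": ["assistance", "aid"],
    "people": ["residents"],
}


def augment_tweet(tweet, lexicon, synonyms, rng, n_replacements=1):
    """Return a copy of the tweet with up to n_replacements words swapped.

    Assumption: lexicon terms, hashtags, mentions, and links are left
    untouched so that the domain signal and tweet structure are kept.
    """
    tokens = tweet.split()
    # Positions that are safe to replace under the assumption above.
    candidates = [
        i for i, tok in enumerate(tokens)
        if tok.lower() in synonyms
        and tok.lower() not in lexicon
        and not tok.startswith(("#", "@", "http"))
    ]
    chosen = rng.sample(candidates, min(n_replacements, len(candidates)))
    for i in chosen:
        tokens[i] = rng.choice(synonyms[tokens[i].lower()])
    # One-for-one replacement keeps the token count and word order.
    return " ".join(tokens)


if __name__ == "__main__":
    rng = random.Random(0)
    original = "Urgent help needed at the flood shelter #Chennai"
    print(augment_tweet(original, CRISIS_LEXICON, SYNONYMS, rng, n_replacements=2))
```

Because each replacement is one-for-one, the augmented tweet keeps the length and word order of the original, which is one simple way to preserve tweet structure. A real pipeline would load the actual lexicon and embedding resources instead of the toy tables used here.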
