Email is the most common means of communication between internet users. As a result, the number of emails sent and received keeps growing, and users cannot keep track of all of them. Email thread identification approaches help, but they do not reveal the sentiment behind an email thread. To address this issue, this paper uses the Probabilistic Latent Semantic Analysis (PLSA) clustering algorithm to identify the sentiment sequence of email threads. The sentiment and the thread sequence within the emails are discovered as sentiment polarity and temporal categories with the help of PLSA clusters. In the first stage, we generate sentiment features for each email using three feature extraction methods, LDA, bag of words (BoW), and TF-IDF, together with the SentiWordNet (SWN) lexicon. Next, the PLSA algorithm forms email clusters based on these sentiment features, which helps identify the thread sentiment and the sequence of sentiment threads. Email threads thus give users a mechanism for finding the sequence within a thread, based on sentiment analysis of the emails belonging to a specific set of communications during a specific time period. The proposed model is evaluated using accuracy, precision, recall, and F-measure, and is compared with other standard algorithms. A statistical test has also been performed.
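As a rough illustration of the clustering step, the sketch below fits PLSA with expectation-maximization on a small document-term count matrix and assigns each email to its most probable latent class. It is not the authors' implementation: the function name `plsa`, the toy emails, and the use of plain bag-of-words counts in place of the paper's LDA, TF-IDF, and SentiWordNet features are all assumptions made for the example.

```python
import math
import random
from collections import Counter


def plsa(counts, n_topics, n_iter=50, seed=0):
    """Fit PLSA by EM on a document-term count matrix (list of lists).

    Returns P(z|d), P(w|z), and the log-likelihood seen at each iteration.
    """
    rng = random.Random(seed)
    n_docs, n_words = len(counts), len(counts[0])

    def normalize(row):
        total = sum(row)
        if total == 0:  # guard against a latent class that received no mass
            return [1.0 / len(row)] * len(row)
        return [v / total for v in row]

    # Random positive initialization of P(z|d) and P(w|z).
    p_z_d = [normalize([rng.random() + 1e-3 for _ in range(n_topics)])
             for _ in range(n_docs)]
    p_w_z = [normalize([rng.random() + 1e-3 for _ in range(n_words)])
             for _ in range(n_topics)]
    log_likelihoods = []

    for _ in range(n_iter):
        new_z_d = [[0.0] * n_topics for _ in range(n_docs)]
        new_w_z = [[0.0] * n_words for _ in range(n_topics)]
        ll = 0.0
        for d in range(n_docs):
            for w in range(n_words):
                n = counts[d][w]
                if n == 0:
                    continue
                # E-step: P(z|d,w) is proportional to P(z|d) * P(w|z).
                joint = [p_z_d[d][z] * p_w_z[z][w] for z in range(n_topics)]
                total = sum(joint)
                ll += n * math.log(total)
                for z in range(n_topics):
                    resp = n * joint[z] / total
                    new_z_d[d][z] += resp
                    new_w_z[z][w] += resp
        # M-step: renormalize the expected counts into probabilities.
        p_z_d = [normalize(row) for row in new_z_d]
        p_w_z = [normalize(row) for row in new_w_z]
        log_likelihoods.append(ll)
    return p_z_d, p_w_z, log_likelihoods


# Hypothetical toy emails; a real pipeline would use sentiment features.
emails = [
    "thanks great work on the report",
    "great news thanks for the update",
    "sorry the delay is a bad problem",
    "bad news the problem is worse",
]
vocab = sorted({w for e in emails for w in e.split()})
bags = [Counter(e.split()) for e in emails]
counts = [[bag[w] for w in vocab] for bag in bags]

p_z_d, p_w_z, lls = plsa(counts, n_topics=2)
# Hard cluster label per email: the latent class with the highest P(z|d).
clusters = [max(range(2), key=lambda z: row[z]) for row in p_z_d]
print(clusters)
```

The cluster labels depend on the random initialization, so they carry no fixed meaning such as "positive" or "negative"; mapping clusters to sentiment polarity would need the lexicon-based features described in the abstract.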