Abstract
This research studies the problem of email overload and proposes a system that automatically detects whether an email is "to read" or "to do". The goal of our research is to test whether both the feature extraction and sentence classification phases can be automated using word embeddings. We pursue this goal with a simple feature-based adaptation approach, in which an email's sentences are represented as dense numeric vectors of reduced dimensionality using either word embeddings or sentence encodings. Because several types of word embeddings and sentence encodings exist, we compare the sentence representations they produce in order to understand which embeddings or encodings are better suited to detecting the intent of an email. Our experiments cover three types of embeddings: context-free word embeddings (word2vec and GloVe), contextual word embeddings (ELMo and BERT), and sentence embeddings (DAN-based Universal Sentence Encoder and Transformer-based Universal Sentence Encoder). The results suggest that sentence representations based on ELMo embeddings outperform those based on the other embeddings. With ELMo, we achieved an accuracy of 90.10%, compared with word2vec (82.02%), BERT (58.08%), DAN-based USE (86.66%), and Transformer-based USE (88.16%).
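As a rough illustration of the feature-based idea described above, the following minimal sketch represents a sentence as the mean of its word vectors and assigns it to the closer of two class centroids by cosine similarity. Everything in it is an assumption made for illustration: the three-dimensional vectors in `TOY_EMBEDDINGS` are invented, the training sentences are hypothetical, and the nearest-centroid classifier is a stand-in rather than the classifier evaluated in this work. A real pipeline would substitute pretrained embeddings such as word2vec, GloVe, ELMo, or BERT.

```python
import math

# Hypothetical 3-dimensional word vectors, invented for this sketch.
# A real system would load pretrained embeddings instead.
TOY_EMBEDDINGS = {
    "please": [0.9, 0.1, 0.0],
    "send": [0.8, 0.2, 0.1],
    "review": [0.7, 0.3, 0.0],
    "fyi": [0.1, 0.9, 0.1],
    "attached": [0.2, 0.8, 0.0],
    "newsletter": [0.0, 0.9, 0.2],
}
DIM = 3


def sentence_vector(sentence):
    """Mean-pool the vectors of known words; unknown words are skipped."""
    words = sentence.lower().split()
    vectors = [TOY_EMBEDDINGS[w] for w in words if w in TOY_EMBEDDINGS]
    if not vectors:
        return [0.0] * DIM
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(a, b):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def train(labeled_sentences):
    """Compute one centroid (mean sentence vector) per label."""
    by_label = {}
    for sentence, label in labeled_sentences:
        by_label.setdefault(label, []).append(sentence_vector(sentence))
    return {
        label: [sum(column) / len(vectors) for column in zip(*vectors)]
        for label, vectors in by_label.items()
    }


def predict(centroids, sentence):
    """Return the label whose centroid is most similar to the sentence."""
    vector = sentence_vector(sentence)
    return max(centroids, key=lambda label: cosine(vector, centroids[label]))


# Hypothetical labeled sentences, not taken from the study's data.
TRAINING = [
    ("please send", "to-do"),
    ("please review", "to-do"),
    ("fyi attached", "to-read"),
    ("newsletter attached", "to-read"),
]
centroids = train(TRAINING)
```

Calling `predict(centroids, "please review attached")` labels the sentence with whichever centroid its pooled vector is closest to. Swapping `sentence_vector` for a different encoder while keeping the classifier fixed is one way to compare representations in the spirit of the study.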