Learning a sentence representation that captures the full semantics of a document is a challenge in natural language processing: a suitable semantic representation vector for each sentence improves performance on the task of finding similar questions. In this paper, we implement a series of LSTM models that differ in how they extract sentence representations, and we apply them to question retrieval to exploit the hidden semantics of sentences. Each method derives the sentence representation from the hidden layers of the LSTM model. The results show that combining Max Pooling and Mean Pooling gives the best results on the SemEval 2017 dataset for the task of finding similar questions.
Keywords: LSTM; Deep Learning; NLP; QA; learning sentence representation; CQA.
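To make the pooling idea concrete, the following is a minimal sketch of how a sentence vector could be built from the per-token hidden states of an LSTM and then used to score two questions. It rests on several assumptions that the abstract does not confirm: the Max Pooling and Mean Pooling vectors are combined by concatenation, similarity is measured with cosine similarity, and the hidden states are toy values rather than outputs of a trained model. The function names are hypothetical.

```python
import math


def mean_pool(hidden_states):
    """Average each hidden dimension over all time steps."""
    steps = len(hidden_states)
    return [sum(column) / steps for column in zip(*hidden_states)]


def max_pool(hidden_states):
    """Take the maximum of each hidden dimension over all time steps."""
    return [max(column) for column in zip(*hidden_states)]


def max_mean_representation(hidden_states):
    """Combine both poolings into one sentence vector.

    Assumption: the combination is a concatenation, which doubles the
    vector size. The abstract does not state how the two are combined.
    """
    return max_pool(hidden_states) + mean_pool(hidden_states)


def cosine_similarity(u, v):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Toy hidden states: one row per token, one column per hidden unit.
# In practice these would come from the LSTM's hidden layer.
question_a = [[0.1, -0.3], [0.5, 0.2], [0.0, 0.4]]
question_b = [[0.2, -0.1], [0.4, 0.3]]

rep_a = max_mean_representation(question_a)
rep_b = max_mean_representation(question_b)
score = cosine_similarity(rep_a, rep_b)
print(rep_a, rep_b, score)
```

Because both poolings reduce over the time axis, questions of different lengths map to vectors of the same size, which is what allows them to be compared directly.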