Abstract
This article proposes a novel sentence semantic equivalence identification (SSEI) method that uses the semantic difference features between sentences. The lexical differences of a sentence pair are first extracted, and a bidirectional long short-term memory (BiLSTM) network is then applied to them to generate semantic difference representations. Finally, an efficient gate mechanism is proposed to integrate the semantic differences with existing models (called base models) and enhance their encoding capability in the SSEI task. Extensive experiments on the standard Quora corpus and the Large-scale Chinese Question Matching Corpus (LCQMC) show that the proposed gated semantic difference (GSD) method brings significant improvements to several existing state-of-the-art models. When the bidirectional encoder representations from transformers (BERT) model is used as the base model, the accuracy for SSEI on Quora improves from 90.63% to 91.98%, and the F1 score on the LCQMC improves from 87.0% to 87.7%; both results exceed the best published results.
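The pipeline described above can be illustrated with a minimal sketch. The abstract does not give the exact definitions, so everything here is an assumption: `lexical_difference` treats the lexical difference as the tokens that appear in only one sentence, and `gated_fusion` uses a common element-wise sigmoid gate to blend a base-model vector with a difference vector. The function names, the gate form, and the parameters `gate_weights` and `gate_bias` are hypothetical, and the BiLSTM encoding step is omitted.

```python
import math


def lexical_difference(tokens_a, tokens_b):
    """Return the tokens unique to each sentence, preserving their order.

    Assumption: "lexical difference" means tokens absent from the other sentence.
    """
    set_a, set_b = set(tokens_a), set(tokens_b)
    only_a = [t for t in tokens_a if t not in set_b]
    only_b = [t for t in tokens_b if t not in set_a]
    return only_a, only_b


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(base_vec, diff_vec, gate_weights, gate_bias):
    """Blend a base vector and a difference vector with a sigmoid gate.

    Assumed gate form: g = sigmoid(W [base; diff] + b),
    output = g * base + (1 - g) * diff, applied element-wise.
    gate_weights holds one row of length 2 * dim per output dimension.
    """
    concat = list(base_vec) + list(diff_vec)
    fused = []
    for i, (row, bias) in enumerate(zip(gate_weights, gate_bias)):
        g = sigmoid(sum(w * x for w, x in zip(row, concat)) + bias)
        fused.append(g * base_vec[i] + (1.0 - g) * diff_vec[i])
    return fused


# Usage: two questions that differ in a single word.
question_a = "how do i learn python".split()
question_b = "how do i learn java".split()
diff_a, diff_b = lexical_difference(question_a, question_b)

# With all-zero gate parameters the gate is 0.5, so the output is the average.
zero_weights = [[0.0] * 4 for _ in range(2)]
fused = gated_fusion([1.0, 3.0], [3.0, 1.0], zero_weights, [0.0, 0.0])
```

In a trained model the gate parameters would be learned, so the gate could decide per dimension how much of the difference signal to let through; the zero-valued parameters above only make the behaviour easy to check by hand.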