Abstract

The extraction of implicit citations is becoming more important because it is a fundamental step in many other applications, such as paper summarization, citation sentiment analysis, and citation classification. This paper describes the limitations of previous work on citation extraction and then proposes a new approach based on topic modeling and word embedding. As a first step, our approach uses the LDA technique to identify the topics discussed in the cited paper. Following the idea behind the Doc2Vec technique, our approach proposes two models. The first, called Sentence2Vec, is used to represent all sentences following an explicit citation; these sentences are candidates to be implicit citation sentences. The second, called Topic2Vec, is used to represent the topics covered in the cited paper. Based on the similarity between the Sentence2Vec and Topic2Vec representations, we can label a candidate sentence as implicit or not.
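
The final labeling step can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the decision rule, or any threshold, so cosine similarity, the "best-matching topic" rule, the value 0.5, and the names `cosine` and `label_candidates` are all assumptions made for illustration. The vectors are toy values standing in for learned Sentence2Vec and Topic2Vec representations.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def label_candidates(sentence_vecs, topic_vecs, threshold=0.5):
    """Label each candidate sentence as an implicit citation (True) or not (False).

    Assumption: a sentence is labeled implicit when its similarity to the
    best-matching topic vector reaches the threshold. The threshold value
    is hypothetical and not taken from the paper.
    """
    labels = []
    for s in sentence_vecs:
        best = max((cosine(s, t) for t in topic_vecs), default=0.0)
        labels.append(best >= threshold)
    return labels


# Toy stand-ins for Topic2Vec vectors of the cited paper's topics.
topic_vecs = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
# Toy stand-ins for Sentence2Vec vectors of sentences following an explicit citation.
sentence_vecs = [[0.9, 0.1, 0.0], [0.0, 0.1, 0.9]]

labels = label_candidates(sentence_vecs, topic_vecs)
```

In this toy setup the first candidate lies close to the first topic vector and is labeled implicit, while the second lies far from both topics and is not. In practice the vectors would come from trained embedding models, and the threshold would need to be tuned on annotated data.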
