Using part-of-speech tags as deep-syntax indicators in determining short-text semantic similarity

Vuk Batanovic, Dragan Bojic

doi:10.2298/csis131127082b

Open Access

https://doi.org/10.2298/csis131127082b

Journal: Computer Science and Information Systems
Publication Date: Jan 1, 2015
Citations: 49
License type: CC BY-NC-ND 4.0

Abstract

This paper presents POST STSS, a method for determining short-text semantic similarity in which part-of-speech tags serve as indicators of the deeper syntactic information usually extracted by more advanced tools such as parsers and semantic role labelers. Our model is based on a statistical bag-of-words approach and employs a part-of-speech weighting scheme. It requires neither hand-crafted knowledge bases nor advanced syntactic tools, which makes it easily applicable to languages with limited natural language processing resources. Using a paraphrase recognition test, we demonstrate that our system achieves higher accuracy than all existing statistical similarity algorithms and solutions of a more structural kind.
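
To make the general idea concrete, the following is a minimal sketch of a bag-of-words similarity in which each word's contribution is scaled by a weight tied to its part-of-speech tag. It is not the POST STSS formula, which the abstract does not specify. The tag names, the weight values in `POS_WEIGHTS`, the `DEFAULT_WEIGHT` fallback, and the use of cosine similarity are all assumptions made for illustration, and the input is assumed to be already tagged as `(word, tag)` pairs.

```python
import math
from collections import Counter

# Hypothetical per-tag weights chosen for illustration only;
# these are NOT the weights used by POST STSS.
POS_WEIGHTS = {"NOUN": 1.0, "VERB": 0.8, "ADJ": 0.6, "ADV": 0.4}
DEFAULT_WEIGHT = 0.1  # assumed weight for every other tag


def weighted_bag(tagged_tokens):
    """Build a bag-of-words vector where each occurrence adds its POS weight."""
    bag = Counter()
    for word, tag in tagged_tokens:
        bag[word.lower()] += POS_WEIGHTS.get(tag, DEFAULT_WEIGHT)
    return bag


def similarity(tagged_a, tagged_b):
    """Cosine similarity between two POS-weighted bags of words."""
    bag_a, bag_b = weighted_bag(tagged_a), weighted_bag(tagged_b)
    dot = sum(weight * bag_b[word] for word, weight in bag_a.items())
    norm_a = math.sqrt(sum(w * w for w in bag_a.values()))
    norm_b = math.sqrt(sum(w * w for w in bag_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty text is treated as dissimilar to everything
    return dot / (norm_a * norm_b)


# Two pairs of short texts, tagged by hand for the example.
s1 = [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")]
s2 = [("a", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")]
s3 = [("the", "DET"), ("dog", "NOUN"), ("runs", "VERB")]

print(similarity(s1, s2) > similarity(s1, s3))
```

Under these assumed weights, sharing a noun and a verb counts for far more than sharing a determiner, so `s1` scores closer to `s2` than to `s3`. In practice the tags would come from a part-of-speech tagger, which is the only linguistic tool this kind of approach needs.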

Full Text