Abstract

<p>Geospatial knowledge contained in the large body of academic papers can support knowledge services such as location-based analysis of research hotspots, spatio-temporal data aggregation, and recommendation of research results. However, this knowledge is usually implicit in the literature and unstructured, which makes it difficult to access and mine directly, or to use for the rapid production of large numbers of thematic maps. In this paper, we take one kind of geospatial knowledge, the study area, as an example and describe its extraction in detail. We propose an algorithm that integrates feature template matching with random forest classification to accurately identify study areas from the abstracts of academic papers and to produce thematic maps. First, geographical names are recognized precisely in two steps, using the BiLSTM-CRF algorithm followed by an improved heuristic disambiguation method. Then, the study area is extracted with a purpose-designed integrated feature recognition template and a random forest classifier, and a rapid thematic mapping scheme is designed for the resulting knowledge of study area, topic, and literature. Experimental results show that study area recognition reaches an accuracy of 97%, an F-value of 96%, and a recall of 96%, so study areas can be extracted from text both accurately and efficiently. Based on this geospatial knowledge, thematic maps can be generated quickly and express the content accurately.</p>
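To make the idea of a feature template for candidate place names more concrete, the following is a minimal sketch of how each recognized toponym in an abstract might be turned into a small feature vector. The cue phrases, the feature names, and the `Candidate` type are all illustrative assumptions and are not taken from the paper; in the pipeline described above, vectors of this kind would be the input to a random forest classifier that decides whether a toponym is the study area.

```python
import re
from dataclasses import dataclass

# Hypothetical cue phrases that often introduce a study area (an assumption,
# not the template used in the paper). The pattern must end right before the
# toponym, hence the trailing "\s+$".
CUE_PATTERN = re.compile(
    r"\b(?:taking|located in|conducted in|case study (?:of|in))\s+$",
    re.IGNORECASE,
)


@dataclass
class Candidate:
    name: str   # surface form of the recognized toponym
    start: int  # character offset of the toponym in the abstract


def template_features(text: str, cand: Candidate) -> dict:
    """Build an illustrative feature vector for one recognized toponym."""
    # Look at up to 30 characters immediately to the left of the toponym.
    left_context = text[max(0, cand.start - 30):cand.start]
    return {
        # 1 if a cue phrase directly precedes the toponym, else 0.
        "has_cue": int(bool(CUE_PATTERN.search(left_context))),
        # Where the toponym occurs in the abstract, scaled to [0, 1].
        "relative_position": round(cand.start / max(len(text), 1), 2),
        # How many times the toponym appears in the abstract.
        "frequency": len(re.findall(re.escape(cand.name), text)),
    }


if __name__ == "__main__":
    abstract = (
        "Taking Wuhan as the study area, we compare the results "
        "with data from Beijing."
    )
    for name in ("Wuhan", "Beijing"):
        cand = Candidate(name, abstract.index(name))
        print(name, template_features(abstract, cand))
```

In this toy abstract only "Wuhan" is directly preceded by a cue phrase, which is the kind of signal a classifier could use to separate the study area from other place names mentioned in passing.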

