Abstract

Expression-level information extraction is a challenging Natural Language Processing (NLP) task that aims to retrieve important information from text documents. However, up-to-date data sources that could accelerate research on expression-level information extraction are still lacking, especially in the field of Chinese financial high technology. To fill this gap, we present Fintech Key-Phrase: a human-annotated key-phrase dataset for the Chinese financial high technology field, which contains more than 12K paragraphs together with their annotated domain-specific key-phrases. We extract the publicly released Chinese Management's Discussion and Analysis (CMD&A) reports from the well-known Chinese Research Data Services Platform (CNRDS) and then filter them for reports related to Financial High-Tech. The Financial High-Tech key-phrases are annotated according to pre-defined guidelines to control annotation quality. To demonstrate that Fintech Key-Phrase helps retrieve valuable information in the field of Chinese financial high technology, we adopt several strong information retrieval systems as representative baselines and report their performance on the dataset. We hope this dataset can facilitate scientific research and further exploration in the Chinese Financial High-Tech domain. The Fintech Key-Phrase dataset and the experimental code for the adopted baselines are available on GitHub ( https://github.com/albert-jin/Fintech-Key-Phrase/ ). To encourage newcomers to get involved in information retrieval for the Chinese financial high technology field, we have also built an open website ( https://albert-jin.github.io/FintechKP-frontend/ ) and a real-time information retrieval API tool ( https://31863ew564.zicp.fun/information_retrieval/ ).

