Abstract

Socio-economic indicators are powerful instruments for measuring economic conditions, and extracting them can help people grasp economic trends and make decisions. Traditional machine learning methods for indicator extraction rely heavily on handcrafted features, which cost a large amount of human effort. Deep learning methods avoid this problem but require a huge amount of labeled data, which is the trickiest challenge, as labeled data for the indicator extraction task is quite scarce. In this paper, we use a BERT-based model to address these challenges. The model first represents the input text with BERT, taking advantage of BERT's strong ability to capture generic language features. It then fine-tunes the pre-trained model on the labeled data of our indicator extraction task to learn task-specific features. Finally, the representations pass through a conditional random field (CRF) layer to predict the tag sequence over the output tokens. In this way, our model does not require much labeled data, yet it automatically and sufficiently captures the language features of the input text. Additionally, this paper constructs a medium-scale dataset for the fine-tuning process and evaluates our model on it. The results demonstrate that the BERT-based model is superior to several strong baselines.
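The final step of the pipeline described above is CRF decoding: given per-token label scores (here, the ones a fine-tuned BERT encoder would emit) and label-transition scores, Viterbi search recovers the highest-scoring tag sequence. The sketch below is illustrative only; the BIO label set, scores, and example tokens are assumptions, not the paper's actual parameters.

```python
# Minimal sketch of CRF (Viterbi) decoding over token label scores.
# Emission scores stand in for a fine-tuned BERT encoder's outputs;
# all numbers are illustrative, not the paper's parameters.

def viterbi_decode(emissions, transitions, labels):
    """emissions: list of {label: score} dicts, one per token;
    transitions: {(prev_label, cur_label): score};
    returns the best-scoring label sequence."""
    # Best path score ending in each label at the first token.
    best = {lab: emissions[0][lab] for lab in labels}
    backptrs = []
    for em in emissions[1:]:
        new_best, ptr = {}, {}
        for cur in labels:
            # Pick the previous label that maximizes the path score.
            prev = max(labels, key=lambda p: best[p] + transitions[(p, cur)])
            new_best[cur] = best[prev] + transitions[(prev, cur)] + em[cur]
            ptr[cur] = prev
        best = new_best
        backptrs.append(ptr)
    # Backtrack from the highest-scoring final label.
    last = max(labels, key=lambda lab: best[lab])
    path = [last]
    for ptr in reversed(backptrs):
        path.append(ptr[path[-1]])
    return list(reversed(path))

# Hypothetical example: tagging "GDP growth rose" with BIO labels,
# where B-IND/I-IND mark an indicator span.
labels = ["B-IND", "I-IND", "O"]
transitions = {(p, c): 0.0 for p in labels for c in labels}
transitions[("O", "I-IND")] = -5.0  # penalize I-IND without a preceding B/I
emissions = [
    {"B-IND": 2.0, "I-IND": 1.5, "O": 0.1},  # "GDP"
    {"B-IND": 0.2, "I-IND": 1.8, "O": 0.5},  # "growth"
    {"B-IND": 0.1, "I-IND": 0.3, "O": 2.0},  # "rose"
]
print(viterbi_decode(emissions, transitions, labels))
# → ['B-IND', 'I-IND', 'O']
```

The transition penalty is what distinguishes a CRF from independent per-token classification: even if a token's emission score slightly favors I-IND, an invalid O→I-IND jump is suppressed, so predicted spans stay well-formed.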


