Abstract

Scientific economic activities record how research funds are used, and these funds are a critical component of scientific research. Detecting potentially risky behavior in scientific economic activities is crucial to risk management at research institutions. Most existing approaches, however, tackle the problem with traditional machine learning algorithms that rely on manual feature extraction. As a result, they struggle to extract complex semantic features or to fuse information from hybrid data for effective risk identification. To overcome these challenges, we propose a novel Risk Identification model for Scientific Economic activities from HYbrid data (HY-RISE), which incorporates both textual and structured data. First, a pretrained BERT module captures the semantic information in the textual data. Next, a BiGRU module augments the contextual information in the semantic embeddings. Finally, a shallow neural network fuses the augmented semantic representation with the other discrete features to obtain the final representation. Experimental results on a real reimbursement dataset demonstrate that HY-RISE clearly outperforms existing models in effectiveness and robustness for risk identification.
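The three-stage pipeline described above (text encoder, BiGRU, shallow fusion network) can be illustrated with a minimal PyTorch sketch. It is not the authors' implementation. To stay self-contained, a plain embedding layer stands in for the pretrained BERT module, and the class name `HybridRiskClassifier`, all layer sizes, the number of structured features, and the two-class output are assumptions chosen for illustration.

```python
import torch
import torch.nn as nn


class HybridRiskClassifier(nn.Module):
    """Hypothetical sketch: text encoder -> BiGRU -> fusion with structured features."""

    def __init__(self, vocab_size=1000, embed_dim=32, gru_hidden=16,
                 num_structured=5, fusion_hidden=24, num_classes=2):
        super().__init__()
        # Assumption: an embedding table stands in for the pretrained BERT
        # module so the sketch runs without downloading any weights.
        self.text_encoder = nn.Embedding(vocab_size, embed_dim)
        # Bidirectional GRU that adds contextual information to the embeddings.
        self.bigru = nn.GRU(embed_dim, gru_hidden,
                            batch_first=True, bidirectional=True)
        # Shallow network that fuses the text representation with the
        # structured (discrete) features and produces class logits.
        self.fusion = nn.Sequential(
            nn.Linear(2 * gru_hidden + num_structured, fusion_hidden),
            nn.ReLU(),
            nn.Linear(fusion_hidden, num_classes),
        )

    def forward(self, token_ids, structured):
        embeddings = self.text_encoder(token_ids)        # (batch, seq, embed_dim)
        _, last_hidden = self.bigru(embeddings)          # (2, batch, gru_hidden)
        # Concatenate the final forward and backward hidden states.
        text_repr = torch.cat([last_hidden[0], last_hidden[1]], dim=-1)
        # Fuse text and structured features by simple concatenation.
        fused = torch.cat([text_repr, structured], dim=-1)
        return self.fusion(fused)                        # (batch, num_classes)


torch.manual_seed(0)
model = HybridRiskClassifier()
token_ids = torch.randint(0, 1000, (4, 12))   # 4 records, 12 tokens each
structured = torch.rand(4, 5)                 # 5 structured features per record
logits = model(token_ids, structured)
print(logits.shape)  # torch.Size([4, 2])
```

In a fuller version, the embedding layer would be replaced by a pretrained BERT encoder whose token-level outputs feed the BiGRU. Concatenation is used here as the simplest fusion choice; the abstract does not specify how the fusion is performed.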

