Abstract

Accurately predicting growth potential is important for government grants and contributions programs that support Small and Medium-sized Enterprises (SMEs), but it is a challenging task because of data heterogeneity (structured data combined with free-text data written in English and French), class imbalance, and the difficulty of learning features efficiently. To address these challenges, this paper presents a novel BERT-TCN model for portfolio predictions in government funding programs, with the following key contributions. First, we describe the application of this architecture to a prediction task involving sequential, structured, partially quantitative input data together with free-text input data. Specifically, the model predicts the growth of firms receiving government funding for innovation. It integrates a Transformer model, BERT, for text modeling with a Temporal Convolutional Network (TCN) for sequential prediction, and it is designed to handle both the class imbalance in the data and the difficulty of efficient feature learning. Second, we develop several performance evaluation criteria (Section 4.3) that allow the proposed approach to be assessed from both a machine learning perspective and a funding program-specific perspective. Third, we quantify and evaluate the importance of both text and numerical features, which gives insight into how different features contribute to the predictions and supports the explainability of the model. The proposed approach is trained and tested on a large dataset drawn from a rich database, and the results demonstrate that it can greatly help individual human experts improve their results.
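As a rough illustration of the sequential component named above, a TCN is typically built from causal dilated convolutions, in which the output at time step t depends only on inputs at t and earlier. The sketch below shows that single operation in plain Python. The abstract does not specify the layer configuration, so the function name `causal_dilated_conv1d`, the kernel values, and the dilation are assumptions chosen for illustration; this is not the authors' implementation.

```python
from typing import List


def causal_dilated_conv1d(
    x: List[float], kernel: List[float], dilation: int = 1
) -> List[float]:
    """Hypothetical helper: causal dilated 1-D convolution over one feature channel.

    output[t] = sum_k kernel[k] * x[t - k * dilation], where positions before
    the start of the sequence are treated as zero (left zero-padding).
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # only current and past time steps contribute
                acc += w * x[idx]
        out.append(acc)
    return out


# Illustrative input: one numeric feature observed over four time steps.
series = [1.0, 2.0, 3.0, 4.0]
result = causal_dilated_conv1d(series, kernel=[1.0, 1.0], dilation=2)
print(result)  # [1.0, 2.0, 4.0, 6.0]
```

Stacking such layers with increasing dilation widens the history each prediction can see. In a model of the kind described here, the per-step input would presumably combine numeric features with a text representation from BERT, but how the two are fused is not stated in the abstract.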

