Abstract
The European Union reallocates money to its member states through different kinds of funding. Member states categorize EU funding projects using their own categorization systems. Although the EU has prepared an integrated European categorization system, many member states do not use it in their reports, which hinders straightforward fiscal analysis. This article aims to support the categorization of EU funding projects automatically by means of machine learning. Our experiments showed that Support Vector Machines (SVM) was the best-performing machine learning algorithm for this task. Using the SVM classifier, we can classify EU funding projects from their lexical descriptions better than a baseline (i.e., assigning every project to the majority class). Further, we found that the approach using a natural language translator outperforms the approach using word sense disambiguation. Finally, we investigated the influence of project description length on classifier performance. The results showed a positive correlation between description length and classifier performance for project descriptions in English, whereas for project descriptions in non-English languages the classifier performed better on shorter descriptions. In the future, we plan to build an online application that uses the classifier on the back end and gives the user a category recommendation on the front end through a visualization of the EU categorization system.
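The baseline mentioned above can be illustrated with a minimal sketch. It assumes the baseline assigns every project to the most frequent category in the training data; the category names and label lists are hypothetical toy values, not data from the article, and the helper names `majority_baseline` and `accuracy` are introduced here only for illustration.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Return the most frequent label in the training data."""
    return Counter(train_labels).most_common(1)[0][0]


def accuracy(predicted, gold):
    """Fraction of positions where the prediction matches the gold label."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    return sum(p == g for p, g in zip(predicted, gold)) / len(gold)


# Hypothetical toy labels (assumption): not taken from the article.
train_labels = ["transport", "transport", "education", "transport", "health"]
test_labels = ["transport", "education", "transport", "health"]

# The baseline predicts the majority training class for every test project.
major = majority_baseline(train_labels)
baseline_predictions = [major] * len(test_labels)
baseline_accuracy = accuracy(baseline_predictions, test_labels)

print(major, baseline_accuracy)
```

A trained classifier would be scored with the same `accuracy` function on the same test labels, and it beats the baseline only if its score is higher than `baseline_accuracy`.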