Abstract

This study develops, implements, and evaluates a multi-label text classification algorithm called the multi-label categorical K-nearest neighbor (ML-CKNN). The algorithm is designed to automatically identify 25 types of risk factors, each with a specific meaning, reported in Section 1A of SEC Form 10-K. The idea behind ML-CKNN is to compute, for each label, a categorical similarity score from the K nearest neighbors within that category. ML-CKNN is tailored to extracting risk factors from 10-K filings. The proposed algorithm assigns exactly the correct set of labels to 74.94% of risk factors and classifies 98.75% of individual labels correctly. Moreover, ML-CKNN is shown empirically to outperform ML-KNN and other multi-label algorithms. The extracted risk factors could be valuable for empirical studies in accounting and finance.
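
The per-label scoring idea can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not specify the text representation, the similarity measure, how the K neighbor similarities are aggregated, or the decision rule. The sketch assumes bag-of-words cosine similarity, the mean of the top-K similarities as the categorical score, and a fixed threshold. The names `categorical_scores`, `predict`, and `exact_match_and_label_accuracy`, the toy sentences, and the three labels are all hypothetical. The last function shows one common reading of the two reported figures (exact-match ratio over risk factors and accuracy over individual label decisions), which is also an assumption.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def categorical_scores(query, train, k=2):
    """For each label, average the query's similarity to its k most
    similar training texts that carry that label (assumed aggregation)."""
    q = Counter(query.lower().split())
    sims_by_label = {}
    for text, labels in train:
        sim = cosine(q, Counter(text.lower().split()))
        for label in labels:
            sims_by_label.setdefault(label, []).append(sim)
    scores = {}
    for label, sims in sims_by_label.items():
        top = sorted(sims, reverse=True)[:k]
        scores[label] = sum(top) / len(top)
    return scores


def predict(query, train, k=2, threshold=0.3):
    """Assign every label whose categorical score reaches the threshold
    (the threshold rule is an assumption, not taken from the paper)."""
    scores = categorical_scores(query, train, k)
    return {label for label, s in scores.items() if s >= threshold}


def exact_match_and_label_accuracy(gold, pred, all_labels):
    """Share of items with a fully correct label set, and share of
    individual (item, label) decisions that are correct."""
    exact = sum(g == p for g, p in zip(gold, pred)) / len(gold)
    correct = sum((lab in g) == (lab in p)
                  for g, p in zip(gold, pred) for lab in all_labels)
    return exact, correct / (len(gold) * len(all_labels))


# Hypothetical toy data; real inputs would be Section 1A risk-factor texts.
train = [
    ("we face intense competition in our markets", {"competition"}),
    ("competition from larger rivals may reduce our market share", {"competition"}),
    ("changes in regulation and competition could harm our business",
     {"regulation", "competition"}),
    ("new government regulation may increase our compliance costs", {"regulation"}),
    ("litigation and legal proceedings could be costly", {"litigation"}),
]
query = "increased competition and new regulation may harm our markets"
predicted = predict(query, train)
```

On this toy data the query receives the labels `competition` and `regulation` but not `litigation`, because only the first two categories contain neighbors that are similar enough. Scoring each label against neighbors drawn from its own category, rather than from one shared neighborhood, means a rare label is judged against its own examples instead of being crowded out by frequent ones.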
