Abstract

Disclosure is the soul of supervision and the bridge between companies and investors, and it helps the public fully understand companies' business operations. The disclosure quality of listed companies' annual reports affects securities market efficiency and the protection of investor rights. Current problems include, but are not limited to, contradictory information and avoidance of important issues. Moreover, disclosure quality varies greatly across companies. This paper focuses on the annual reports of listed companies, divides them into two categories based on the Shenzhen Stock Exchange's disclosure assessment results, and studies their textual characteristics. First, three characteristic indicators of each annual report (tone, readability, and file size) are analyzed, compared, and validated. To explore the impact of these indicators on text classification, they are then introduced into text models to construct comprehensive models. To reduce dimensionality for model training, feature selection is performed using chi-square statistics, and keyword dictionaries of different lengths are constructed. The results show that the prediction performance of the classifier models is improved or maintained after the indicators are introduced, with the random forest model showing the largest improvement.
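The chi-square feature selection step can be illustrated with a minimal sketch. The abstract does not describe the implementation, so everything below is an assumption made for illustration: the toy tokens and labels are invented, the function names `chi_square_scores` and `build_dictionary` are hypothetical, and the score is the standard 2x2 term/class chi-square statistic rather than necessarily the exact variant used in the paper.

```python
from collections import Counter

def chi_square_scores(docs, labels):
    """Score each term by the chi-square statistic of its 2x2 term/class table.

    docs   : list of token lists, one per annual report (toy data here)
    labels : list of 0/1 class labels (1 = higher-rated disclosure, assumed)
    """
    n = len(docs)
    n_pos = sum(labels)
    n_neg = n - n_pos

    # Document frequency of each term within each class.
    pos_df, neg_df = Counter(), Counter()
    for tokens, label in zip(docs, labels):
        target = pos_df if label == 1 else neg_df
        target.update(set(tokens))

    scores = {}
    for term in set(pos_df) | set(neg_df):
        a = pos_df[term]   # class 1, term present
        b = neg_df[term]   # class 0, term present
        c = n_pos - a      # class 1, term absent
        d = n_neg - b      # class 0, term absent
        denom = (a + b) * (c + d) * (a + c) * (b + d)
        # A term present in every document carries no class information.
        scores[term] = n * (a * d - b * c) ** 2 / denom if denom else 0.0
    return scores

def build_dictionary(scores, size):
    """Keep the `size` highest-scoring terms (ties broken alphabetically)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:size]]

# Hypothetical toy corpus: four tokenized reports and their class labels.
docs = [
    ["growth", "profit", "risk"],
    ["growth", "profit"],
    ["loss", "risk"],
    ["loss", "lawsuit", "risk"],
]
labels = [1, 1, 0, 0]

scores = chi_square_scores(docs, labels)
short_dictionary = build_dictionary(scores, 3)
long_dictionary = build_dictionary(scores, 5)
```

Varying `size` gives keyword dictionaries of different lengths. In a setup like the one described, document-level indicators such as tone, readability, and file size would be appended as extra columns to the keyword features before training a classifier.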
