Abstract

This study constructs a comprehensive index to effectively judge the optimal number of topics in the LDA topic model. Based on the requirements for selecting the number of topics, a comprehensive judgment index of perplexity, isolation, stability, and coincidence is constructed to select the number of topics. This method provides four advantages to selecting the optimal number of topics: (1) good predictive ability, (2) high isolation between topics, (3) no duplicate topics, and (4) repeatability. First, we use three general datasets to compare our proposed method with existing methods, and the results show that the optimal topic number selection method has better selection results. Then, we collected the patent policies of various provinces and cities in China (excluding Hong Kong, Macao, and Taiwan) as datasets. By using the optimal topic number selection method proposed in this study, we can classify patent policies well.

Full Text
Paper version not known

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.