Abstract

Background/Objectives: In Korea, considerable effort and budget have been spent on improving national R&D information management. However, project summaries of national R&D projects are still not accurate enough to be utilized. Methods/Statistical analysis: To examine the accuracy of project summaries, the Levenshtein Distance Algorithm (LDA) was applied. LDA was expected to identify improper project summaries in which parts of sentences are used repeatedly. To evaluate how the algorithm performs on national R&D information in Korea, the project summaries of 53,492 national R&D projects conducted in 2014 were used. Findings: Unlike other algorithms, LDA was able to detect project summaries consisting of repeatedly used phrases. According to the test with LDA, 3,445 of the 53,492 projects had inaccurate content in their project summaries. In detail, 2,707 projects had an improper research objective, while 712 projects and 26 projects had improper content in the research summary and the expected impact, respectively. Although the algorithm made it possible to extract repeatedly used phrases, it was time-consuming and was therefore applied only offline. In addition, a researcher had to review the results once more to verify their accuracy. Improvements/Applications: This paper applied LDA to detect inappropriate project summaries. The results imply that applying LDA can improve the quality of the information and thereby facilitate its utilization.
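To illustrate the kind of check described above, the following is a minimal sketch of how Levenshtein distance could be used to flag summary fields that largely repeat one another. It is not the implementation used in the study, which the abstract does not describe: the pairwise comparison of fields, the normalization by the longer string, the 0.8 threshold, and the names `similarity` and `flag_similar_fields` are all assumptions made for illustration.

```python
from itertools import combinations


def levenshtein(a, b):
    """Return the edit distance between strings a and b (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def similarity(a, b):
    """Normalize the distance to a score in [0, 1]; 1.0 means identical text."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def flag_similar_fields(fields, threshold=0.8):
    """Return pairs of field names whose texts are nearly identical.

    `fields` maps a field name to its text. The threshold is an assumed value,
    not one taken from the study.
    """
    flagged = []
    for (name_a, text_a), (name_b, text_b) in combinations(fields.items(), 2):
        if similarity(text_a, text_b) >= threshold:
            flagged.append((name_a, name_b))
    return flagged


# Hypothetical project record: the summary repeats the objective almost verbatim.
project = {
    "objective": "Develop a sensor for water quality monitoring",
    "summary": "Develop a sensor for water quality monitoring.",
    "impact": "Lower cost of municipal water testing",
}
print(flag_similar_fields(project))
```

The dynamic-programming loop takes time proportional to the product of the two text lengths, so comparing long summaries across tens of thousands of projects is slow, which is consistent with the abstract's remark that the method was applied only offline.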
