Abstract
When reusing or modifying code, developers need to understand the implicit knowledge behind a piece of code in addition to the literal meaning of code. Such implicit knowledge involves related concepts and their explanations. Uncovering and understanding the implicit knowledge in code are challenging due to the extensive use of abbreviations, scattered expressions of concepts, and ambiguity of concept mentions. In this paper, we propose an automatic approach (called CoLiCo) that can uncover implicit concepts in code and link the uncovered concepts to Wikipedia. Based on a trained identifier embedding model, CoLiCo identifies Wikipedia concepts mentioned in a given code snippet and excerpts a paragraph-level explanation from Wikipedia for each concept. During the process, CoLiCo resolves identifier abbreviation (i.e., concepts mentioned in the form of abbreviations) and identifier aggregation (i.e., concepts mentioned by an aggregation of multiple identifiers) based on identifier embedding and mining of identifier abbreviation/aggregation relations. Experimental study shows that CoLiCo outperforms a general entity linking approach by 38.7% in the correctness of concept linking and identifies 96.7% more correct concept linkings on a dataset with 629 code snippets. The concept linking is significant for program understanding in 54% code snippets. Our user study shows that CoLiCo can significantly shorten the time and improve the correctness in code comprehension tasks that intensively involve implicit knowledge.
Talk to us
Join us for a 30 min session where you can share your feedback and ask us any queries you have
Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.