Abstract

Third-party libraries evolve continuously and accumulate multiple versions. Lucene, for example, released ten new versions (from version 7.7.0 to 8.4.0) in 2019. This multiplicity of versions misleads existing code search methods into retrieving source code that is not compatible with the local programming language. To solve this issue, we propose DCSE, a deep code search model based on evolving information (i.e. evolved code tokens and evolution descriptions). DCSE first mines the evolved code tokens and the evolution description from the code evolution process; it then treats the evolved code tokens as an additional feature of the source code and the evolution description as an additional feature of the code description. With this fuller representation, DCSE embeds source code and its code description into a shared high-dimensional vector space and learns to reduce the cosine distance between their vectors. For ever-evolving third-party libraries such as Lucene, the experimental results show that DCSE retrieves source code that is compatible with the local programming language, and it outperforms the state-of-the-art methods (e.g. CODEnn) by 56.9–60.9% in RFVersion. For rarely-evolving third-party libraries, DCSE outperforms the state-of-the-art methods (e.g. CODEnn) by 4–11% in Precision.
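The retrieval idea described above, matching a description to code by cosine similarity in a shared vector space with evolving information added as an extra feature on each side, can be illustrated with a minimal sketch. This is not the DCSE model: a bag-of-tokens count vector stands in for the learned neural encoders, and every name, token, and snippet label below is hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def embed(tokens):
    """Toy stand-in for a learned encoder: a sparse bag-of-tokens vector."""
    return Counter(t.lower() for t in tokens)


def cosine_similarity(u, v):
    """Cosine similarity between two sparse count vectors (0.0 if either is empty)."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def code_vector(code_tokens, evolved_tokens):
    """Represent a snippet by its code tokens plus its evolved code tokens."""
    return embed(list(code_tokens) + list(evolved_tokens))


def query_vector(description_tokens, evolution_tokens):
    """Represent a description by its words plus its evolution description."""
    return embed(list(description_tokens) + list(evolution_tokens))


def rank(query, candidates):
    """Return candidate names ordered from most to least similar to the query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in candidates.items()]
    return [name for _, name in sorted(scored, key=lambda pair: pair[0], reverse=True)]


# Hypothetical snippets that share the same core tokens but differ in the
# tokens introduced or removed as the (imaginary) library evolved.
candidates = {
    "snippet_old": code_vector(["open", "index", "reader"], ["legacy", "directory"]),
    "snippet_new": code_vector(["open", "index", "reader"], ["path", "directory"]),
}

# The evolution description ("path") is what separates the two candidates.
query = query_vector(["open", "index"], ["path"])
ranking = rank(query, candidates)
print(ranking)
```

Without the evolving-information features, both hypothetical snippets would receive the same score for this query; with them, the snippet whose evolved tokens match the evolution description is ranked first. A trained model would replace `embed` with learned encoders for code and descriptions.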
