Abstract

Cross-language code retrieval is needed in many real-world scenarios. A major application is program translation, e.g., porting codebases from an obsolete or deprecated language to a modern one, or re-implementing existing projects in one’s preferred programming language. Existing approaches based on translation models either require large amounts of training data and extra information or neglect significant characteristics of programs. Leveraging cross-language code retrieval to assist automatic program translation can make use of Big Code. However, existing code retrieval systems struggle to find a translation when only the features of the input program are available as the query. In this paper, we present BigPT, a system for interactive cross-language retrieval from Big Code that relies only on raw code and reuses the retrieved code to assist program translation. We build on existing work on cross-language code representation and propose a novel predictive transformation model based on auto-encoders. The model is trained on Big Code to generate a target-language representation, which is then used as the query to retrieve the most relevant translations for a given program. Our query representation lets the user easily update and correct the returned results to improve the retrieval process. Our experiments show that BigPT outperforms state-of-the-art baselines in terms of program accuracy. With our novel querying and retrieval mechanism, BigPT scales to large datasets and retrieves translations efficiently.
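
The query-then-refine loop described in the abstract can be illustrated with a toy sketch. This is not BigPT's implementation. It assumes programs are already embedded as fixed-length vectors, stands in a plain linear map for the learned auto-encoder-based transformation model, ranks candidates by cosine similarity, and models user correction as interpolation toward an accepted result. All names (`transform`, `retrieve`, `refine`, `WEIGHTS`, `INDEX`) and all numbers are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def transform(source_vec, weights):
    """Map a source-language vector to a predicted target-language vector.

    Assumption: a linear map stands in for the learned transformation model.
    """
    return [sum(w * x for w, x in zip(row, source_vec)) for row in weights]


def retrieve(query_vec, index, k=2):
    """Return the names of the k indexed programs most similar to the query."""
    ranked = sorted(index.items(),
                    key=lambda item: cosine(query_vec, item[1]),
                    reverse=True)
    return [name for name, _ in ranked[:k]]


def refine(query_vec, accepted_vec, alpha=0.5):
    """Move the query toward a result the user marked as relevant.

    Assumption: simple interpolation; the paper's update rule may differ.
    """
    return [(1 - alpha) * q + alpha * a for q, a in zip(query_vec, accepted_vec)]


# Hypothetical target-language index: file name -> embedding.
INDEX = {
    "sort.py": [1.0, 0.0, 0.0],
    "search.py": [0.0, 1.0, 0.0],
    "hash.py": [0.0, 0.0, 1.0],
}

# Hypothetical "learned" map that swaps the first two dimensions.
WEIGHTS = [[0.0, 1.0, 0.0],
           [1.0, 0.0, 0.0],
           [0.0, 0.0, 1.0]]

source_vec = [0.9, 0.1, 0.0]                  # embedding of the input program
query = transform(source_vec, WEIGHTS)        # predicted target-language query
first_pass = retrieve(query, INDEX, k=2)

# The user marks "hash.py" as the better match; the query is updated and rerun.
query = refine(query, INDEX["hash.py"], alpha=0.8)
second_pass = retrieve(query, INDEX, k=1)
```

The point of the sketch is that the query lives in the target-language representation space, so a user's correction can be folded directly into the query vector before the next retrieval round.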
