Abstract

Python is a popular programming language, and a large part of its appeal comes from its diverse libraries and extension modules. With the boom in data science and machine learning, pairing a Python frontend with a native C/C++ implementation achieves both productivity and performance, and has become almost the standard structure for many mainstream software systems. However, feature discrepancies between the two languages, such as exception handling, memory management, and type systems, can pose many safety hazards in the interface layer that uses the Python/C API. In this paper, we carry out an empirical study of the evolution and bug patterns of the Python/C API. The evolution analysis covers the design of the Python/C API in CPython compilers and its usage in mainstream software. By designing and applying a static analysis toolset, we reveal the evolution and usage statistics of the Python/C API and provide a summary and classification of 9 common bug patterns. In Pillow, a widely used Python imaging library, we find 48 bugs, 19 of which were previously undiscovered. Our toolset can be easily extended to incorporate different types of syntactic bug-finding checkers, and our systematic taxonomy for classifying bugs can guide the construction of more highly automated, high-precision bug-finding tools.