Abstract

Modern web applications offer various APIs for data interaction. However, as the number of these APIs increases, so does the potential for security threats. Essentially, more APIs in an application can lead to more detectable vulnerabilities. Thus, it is crucial to identify APIs as comprehensively as possible in web applications. However, this task faces challenges due to the increasing complexity of web development techniques and the abundance of similar web pages. In this paper, we propose APIMiner, a framework for identifying APIs in web applications by dynamically traversing web pages based on web page state similarity analysis. APIMiner first builds a web page model based on the HTML elements of the current web page. APIMiner then uses this model to represent the state of the page. Then, APIMiner evaluates each element’s similarity in the page model and determines the page state similarity based on these similarity values. From the different states of the page, APIMiner extracts the data interaction APIs on the page. We conduct extensive experiments to evaluate APIMiner’s effectiveness. In the similarity analysis, our method surpasses state-of-the-art methods like NDD and mNDD in accurately distinguishing similar pages. We compare APIMiner with state-of-the-art tools (e.g., Enemy of the State, Crawlergo, and Wapiti3) for API identification. APIMiner excels in the number of identified APIs (average 1136) and code coverage (average 28,470). Relative to these tools, on average, APIMiner identifies 7.96 times more APIs and increases code coverage by 142.72%.

Talk to us

Join us for a 30 min session where you can share your feedback and ask us any queries you have

Schedule a call

Disclaimer: All third-party content on this website/platform is and will remain the property of their respective owners and is provided on "as is" basis without any warranties, express or implied. Use of third-party content does not indicate any affiliation, sponsorship with or endorsement by them. Any references to third-party content is to identify the corresponding services and shall be considered fair use under The CopyrightLaw.