In crowdsourced testing, prioritizing the large number of submitted test reports is critical for improving developers' review efficiency. Many researchers have proposed methods for prioritizing crowdsourced test reports for mobile applications. However, web crowdsourced test reports usually contain more detailed text and more complex screenshots, which makes prioritizing them especially necessary. This paper proposes a prioritization method named TCDiv, based on the data characteristics of web crowdsourced test reports. First, a pre-trained TextCNN model segments the text of each report into two parts: reproduction steps and defect descriptions. Then, features are extracted separately from the textual and image information, and a clustering technique groups similar reports. Finally, the clustering results are ranked and sampled. To validate the approach, experiments were conducted on 717 web crowdsourced test reports. The results show that TCDiv detects more distinct defects within a limited time, thus improving the review efficiency of the development team.
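
The final "rank and sample" step can be illustrated with a minimal sketch. It is not the TCDiv implementation: the function name `prioritize`, the ranking of clusters by size, and the round-robin sampling order are all assumptions made for illustration, since the abstract does not specify them. The sketch takes cluster labels that are already computed and orders the reports so that one report from every cluster is reviewed before any cluster is revisited.

```python
from collections import defaultdict


def prioritize(report_ids, cluster_labels):
    """Return reports in review order, visiting each cluster in turn.

    Assumptions (not stated in the abstract): clusters are ranked by size,
    largest first, and reports keep their input order within a cluster.
    """
    clusters = defaultdict(list)
    for report_id, label in zip(report_ids, cluster_labels):
        clusters[label].append(report_id)

    # Rank clusters: largest first, ties broken by label for determinism.
    ranked = sorted(clusters.items(), key=lambda kv: (-len(kv[1]), kv[0]))
    queues = [members for _, members in ranked]
    total = sum(len(queue) for queue in queues)

    # Round-robin sampling: take the next unseen report from each cluster.
    ordered = []
    depth = 0
    while len(ordered) < total:
        for queue in queues:
            if depth < len(queue):
                ordered.append(queue[depth])
        depth += 1
    return ordered


reports = ["r1", "r2", "r3", "r4", "r5", "r6"]
labels = [0, 0, 1, 2, 0, 1]  # hypothetical cluster assignments
print(prioritize(reports, labels))
```

With these hypothetical labels, the first three reports in the returned order come from three different clusters, which is the intuition behind surfacing more distinct defects early in the review.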