Hundreds of different mobile devices are on the market, produced by different vendors and equipped with different software features and hardware components. Mobile applications may behave differently across devices due to variations in hardware or OS components. Since mobile applications are expected to be deployed and executed on diverse mobile platforms, they must be validated on many different platforms and devices. These peculiarities of mobile application development call for a quality assurance approach that addresses its specific challenges. Moreover, mobile test execution takes a long time because tests run in many different environments and developers must create complex tear-down procedures; such procedures are lengthy and far from perfect, leading to unpredictable failures. Regression testing is a crucial part of mobile app development: it checks that software changes do not break existing functionality. An important assumption of regression testing is that test outcomes are deterministic, i.e., a test is expected to either always pass or always fail for the same code under test. Unfortunately, in real projects spanning multiple release cycles, some tests, often called flaky tests, have non-deterministic outcomes. These tests undermine the regression testing cycle because they make it difficult to rely on test results; such unreliable results significantly reduce trust in the tests and thus undermine the entire mobile app test automation effort. We train machine learning classifiers separately on each test-result dataset and compare their performance across datasets. The proposed model classifies tests from the regression suite as non-deterministic or deterministic based on results collected over multiple release cycles.
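The abstract does not give implementation details, but the classification step it describes could look roughly like the following minimal sketch, assuming a scikit-learn classifier trained on per-test execution history aggregated over release cycles. The feature names, the labeling rule, and the synthetic data are all hypothetical placeholders for illustration, not the authors' actual pipeline.

```python
# Illustrative sketch only: flag non-deterministic (flaky) tests from
# per-test regression history. Features and labels are hypothetical.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)

# Hypothetical features per test, aggregated over several release cycles:
# [failure_rate, pass_to_fail_transitions, avg_runtime_s, reruns_needed]
X = rng.random((500, 4))
# Hypothetical label: 1 = non-deterministic (flaky), 0 = deterministic.
# A toy rule stands in for labels derived from real execution history.
y = (X[:, 1] > 0.7).astype(int)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

clf = RandomForestClassifier(n_estimators=100, random_state=0)
clf.fit(X_train, y_train)
print(classification_report(
    y_test, clf.predict(X_test),
    target_names=["deterministic", "non-deterministic"],
))
```

In practice, the same training and evaluation loop would be repeated separately on each release cycle's test-result dataset so that classifier performance can be compared across datasets, as the abstract describes.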