Developing spreadsheet functionality frequently requires predicting whether a specific section of a spreadsheet is faulty. Although errors in spreadsheets are common and can have serious consequences, today’s spreadsheet creation and management tools offer only weak capabilities for defect detection, localization, and fixing. In this thesis, we propose a method for predicting faults in spreadsheet formulas that can also detect faults in non-formula cells, by combining a catalog of spreadsheet metrics with modern machine learning algorithms. An examination of the individual metrics in the catalog shows that they are well suited to detecting data in cells where a formula is expected, which is likely to indicate a flaw. The proposed framework achieved a recall of 99%, and its performance was compared with that of Melford. The experimental results show that the proposed framework outperforms the Melford framework.