Decision Support Systems (DSSs) based on fuzzy logic are attracting growing research interest as a means of solving classification problems in a wide range of application fields, especially medicine, where the ability to present classification results together with a clear explanation and a measure of the associated uncertainty is highly appealing. However, designing a fuzzy system is a difficult process that requires many steps, from knowledge extraction and representation, through the inference process, to the presentation of results. This paper therefore proposes a general procedure for constructing rule-based fuzzy classifiers according to the performance and interpretability required by the specific application. The procedure can be used with any type of data and is particularly suited to the requirements of the medical field. It is based on the naïve Bayes approximation, so the necessary parameters are optimized only once and separately for each variable, which makes the procedure computationally fast; later steps make it possible to compute more complex models and choose the best one without any further optimization. Moreover, this paper suggests choices for all the degrees of freedom of the design: the variables constituting the model, their fuzzy partitions, the construction of the rule base, and the inference process. Some of these choices are motivated by general considerations regarding systems applied in the medical domain, while others depend on the dataset and on the application. To provide an objective way of making these choices, parameters that define the required trade-off between performance and interpretability are proposed. The application of the procedure is illustrated through a running example that uses the Wisconsin Breast Cancer Dataset.
For different values of the trade-off parameters, the procedure yields optimal interpretability, first-rate performance, or an acceptable balance of the two, relative to the best fuzzy systems applied to the same dataset. Finally, the procedure is applied to a number of benchmark datasets, where it achieves outstanding performance compared with the best state-of-the-art classification methods.
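To make the naïve Bayes idea concrete, the following minimal sketch shows how membership parameters can be fitted once and separately for each variable and then combined by a product to give a class and an associated confidence. It is an illustration under stated assumptions, not the procedure proposed in the paper: the Gaussian membership shape, the product combination, the normalization into confidences, the toy data, and the names `fit_memberships`, `membership`, and `classify` are all hypothetical choices made here.

```python
import math


def fit_memberships(X, y):
    """Fit one Gaussian (mean, std) per class and per variable.

    Assumption: each variable is treated independently (naive Bayes style),
    so no joint optimization across variables is performed.
    """
    params = {}
    for label in sorted(set(y)):
        rows = [x for x, t in zip(X, y) if t == label]
        per_var = []
        for j in range(len(X[0])):
            vals = [r[j] for r in rows]
            mean = sum(vals) / len(vals)
            var = sum((v - mean) ** 2 for v in vals) / len(vals)
            # Fall back to a tiny spread when all values coincide.
            per_var.append((mean, math.sqrt(var) or 1e-6))
        params[label] = per_var
    return params


def membership(x, mean, std):
    """Gaussian membership degree in [0, 1] (assumed shape)."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2)


def classify(params, x):
    """Combine per-variable degrees by product and normalize across classes."""
    scores = {}
    for label, per_var in params.items():
        score = 1.0
        for xj, (mean, std) in zip(x, per_var):
            score *= membership(xj, mean, std)
        scores[label] = score
    total = sum(scores.values())
    if total == 0.0:
        # No class fires at all: report a uniform confidence.
        conf = {label: 1.0 / len(scores) for label in scores}
    else:
        conf = {label: s / total for label, s in scores.items()}
    best = max(conf, key=conf.get)
    return best, conf


# Toy data, invented for illustration only (not the Wisconsin dataset).
X = [[1.0, 1.0], [2.0, 1.5], [8.0, 9.0], [9.0, 8.0]]
y = ["benign", "benign", "malignant", "malignant"]
params = fit_memberships(X, y)
label, conf = classify(params, [1.5, 1.2])
print(label, conf)
```

Because each variable is fitted on its own, adding or removing a variable from the model only changes which factors enter the product, which is one way to see why candidate models can be compared without re-optimizing the parameters.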