Abstract

The Takagi-Sugeno (TS) fuzzy rule system is a widely used data mining technique, and is of particular use in the identification of non-linear interactions between variables. However, the number of rules increases dramatically when it is applied to high-dimensional data sets (the curse of dimensionality). Few robust methods are available to identify important rules while removing redundant ones, and this limits applicability in fields such as epidemiology or bioinformatics, where the interaction of many variables must be considered. Here, we develop a new parsimonious TS rule system. We propose three statistics, the R-, L-, and ω-values, to rank the importance of each TS rule, and a forward selection procedure to construct a final model. We use our method to predict how key components of childhood deprivation combine to influence educational achievement. We show that a parsimonious TS model can be constructed, based on a small subset of rules, that provides an accurate description of the relationship between deprivation indices and educational outcomes. The selected rules shed light on the synergistic relationships between the variables, and reveal that the effect of targeting specific domains of deprivation is crucially dependent on the state of the other domains. Policy decisions need to incorporate these interactions, and deprivation indices should not be considered in isolation. The TS rule system provides a basis for such decision making, and has wide applicability for the identification of non-linear interactions in complex biomedical data.

Highlights

  • In health informatics, one way to support public services planners in making decisions under uncertainty is to provide decision models that are robust and have excellent predictive performance

  • We propose a new index for ranking TS fuzzy rules that considers the contribution of the local linear models (LLMs), termed the L-value of a TS rule

  • Uncertainty arises for areas whose deprivation scores lie between the cut-offs, and the different degrees of high/low membership are taken into account through the weights of the fuzzy rules


Summary

Introduction

In health informatics, one way to support public services planners in making decisions under uncertainty is to provide decision models that are robust and have excellent predictive performance. Fuzzy logic has become one of the cornerstones for characterising uncertainty in system modelling and data mining [2,3,4,5]. A key strength of the TS fuzzy model is its representative power: it can describe a highly nonlinear system with simple local linear models (LLMs). The overall system output is obtained by fusing these subsystems. In this manner, an interaction between variables is revealed, that is, a situation in which the effect of a given level of one variable on an output measure depends on the level of one or more other covariates. The interaction is represented by notably different output rules at different combinations of variable levels (regions of the data space).
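
The fusion of local linear models described above can be illustrated with a minimal Python sketch of first-order TS inference. It is an illustration under stated assumptions, not the model fitted in the paper: the linear-ramp membership function, the product firing strength, the cut-offs, and all coefficients below are hypothetical choices made for the example, and the names `high_membership` and `ts_predict` are ours.

```python
import numpy as np


def high_membership(x, lo, hi):
    """Degree to which x is 'high': 0 below lo, 1 above hi, linear in between.

    The linear ramp between two cut-offs is an assumed membership shape.
    """
    return float(np.clip((x - lo) / (hi - lo), 0.0, 1.0))


def ts_predict(x, cutoffs, rules):
    """Fuse the local linear models of a first-order TS system for one sample.

    x       : sequence of d input values
    cutoffs : list of d (lo, hi) pairs, one per variable (hypothetical values)
    rules   : dict mapping a tuple of 0/1 flags (low/high per variable) to
              coefficients [intercept, slope_1, ..., slope_d] of that rule's LLM
    """
    x = np.asarray(x, dtype=float)
    high = np.array([high_membership(xi, lo, hi)
                     for xi, (lo, hi) in zip(x, cutoffs)])
    weights, outputs = [], []
    for flags, beta in rules.items():
        beta = np.asarray(beta, dtype=float)
        # Firing strength: product of the per-variable memberships for this rule.
        mu = np.where(np.array(flags) == 1, high, 1.0 - high)
        weights.append(mu.prod())
        # Local linear model attached to this rule.
        outputs.append(beta[0] + beta[1:] @ x)
    weights = np.array(weights)
    # Overall output: firing-strength-weighted average of the LLM outputs.
    return float(weights @ np.array(outputs) / weights.sum())


# Two variables, each low/high, give 2**2 = 4 rules (all values are made up).
cutoffs = [(0.0, 1.0), (0.0, 1.0)]
rules = {
    (0, 0): [1.0, 0.0, 0.0],
    (0, 1): [2.0, 0.0, 0.0],
    (1, 0): [3.0, 0.0, 0.0],
    (1, 1): [10.0, 0.0, 0.0],
}
y_mid = ts_predict([0.5, 0.5], cutoffs, rules)  # all four rules fire equally
```

In this toy rule base, moving the first variable from low to high raises the output by 2 when the second variable is low, but by 8 when it is high, which is the kind of interaction that differing rule outputs are meant to expose. A sample lying between the cut-offs belongs partly to several rules, so its prediction blends their LLMs. With d variables and two levels each, the full rule base has 2^d rules, which is why rule selection matters in high dimensions.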
