In general, a fuzzy neural network (FNN) is characterized by its learning algorithm and its linguistic knowledge representation. However, it does not necessarily interact with its environment when the training data is assumed to be an accurate description of the environment under consideration. In interactive problems, it would be more appropriate for an agent to learn from its own experience through interactions with the environment, i.e., reinforcement learning. In this paper, three clustering algorithms are developed based on the reinforcement learning paradigm. This allows a more accurate description of the clusters as the clustering process is influenced by the reinforcement signal. They are the REINFORCE clustering technique I (RCT-I), the REINFORCE clustering technique II (RCT-II), and the episodic REINFORCE clustering technique (ERCT). The integrations of the RCT-I, the RCT-II, and the ERCT within the pseudo-outer product truth value restriction (POPTVR), which is a fuzzy neural network integrated with the truth restriction value (TVR) inference scheme in its five layered feedforward neural network, form the RPOPTVR-I, the RPOPTVR-II, and the ERPOPTVR, respectively. The Iris, Phoneme, and Spiral data sets are used for benchmarking. For both Iris and Phoneme data, the RPOPTVR is able to yield better classification results which are higher than the original POPTVR and the modified POPTVR over the three test trials. For the Spiral data set, the RPOPTVR-II is able to outperform the others by at least a margin of 5.8% over multiple test trials. The three reinforcement-based clustering techniques applied to the POPTVR network are able to exhibit the trial-and-error search characteristic that yields higher qualitative performance.
Read full abstract