Fuzzy rule-based models have been extensively used in regression problems. Besides high accuracy, one of the most appreciated characteristics of these models is their interpretability, which is generally measured in terms of complexity. Complexity is affected by the number of features used for generating the model: the lower the number of features, the lower the complexity. Feature selection can therefore contribute considerably not only to speeding up the learning process, but also to improving the interpretability of the final model. Nevertheless, very few methods for selecting features before learning regression models have been proposed in the literature. In this paper, we focus on these methods, which perform feature selection as a pre-processing step. In particular, we have adapted two state-of-the-art feature selection algorithms, namely NMIFS and CFS, originally proposed for classification, to deal with regression. Further, we propose FMIFS, a novel forward sequential feature selection approach, based on the minimal-redundancy-maximal-relevance criterion, which can directly manage fuzzy partitions. The relevance and the redundancy of a feature are measured in terms of, respectively, the fuzzy mutual information between the feature and the output variable, and the average fuzzy mutual information between the feature and the already selected features. The stopping criterion for the sequential selection is based on the average values of relevance and redundancy of the already selected features. We have performed two experiments on twenty regression datasets. In the first experiment, we aimed to show the effectiveness of feature selection in fuzzy rule-based regression model generation by comparing the mean square errors achieved by the fuzzy rule-based models generated using all the features with those achieved using the features selected by FMIFS, NMIFS and CFS. 
In order to avoid possible biases related to the specific algorithm, we adopted the well-known Wang and Mendel algorithm for generating the fuzzy rule-based models. We show that the mean square errors obtained by models generated using the features selected by FMIFS are on average similar to the values achieved by using all the features, and lower than the ones obtained by employing the subsets of features selected by NMIFS and CFS. In the second experiment, we intended to evaluate how feature selection can reduce the convergence time of evolutionary fuzzy systems, which are probably the most effective fuzzy techniques for tackling regression problems. By using a state-of-the-art multi-objective evolutionary fuzzy system based on rule learning and membership function tuning, we show that the number of evaluations can be considerably reduced when pre-processing the dataset by feature selection.
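
To make the selection criterion summarised above more concrete, the following is a minimal sketch of a forward minimal-redundancy-maximal-relevance selection driven by fuzzy mutual information. It is an illustration under stated assumptions, not the FMIFS algorithm itself: the uniform triangular partitions, the estimate of the joint distribution as the mean product of membership degrees, the unnormalised score (relevance minus average redundancy), and the stopping rule are all assumptions, and the function names are hypothetical.

```python
import math


def triangular_partition(values, n_sets=3):
    """Membership degrees of each value in n_sets uniform triangular fuzzy sets.

    Assumption: a uniform strong partition over [min, max], n_sets >= 2.
    """
    lo, hi = min(values), max(values)
    if hi == lo:  # constant feature: assign everything to the first set
        return [[1.0] + [0.0] * (n_sets - 1) for _ in values]
    width = (hi - lo) / (n_sets - 1)
    centers = [lo + k * width for k in range(n_sets)]
    return [[max(0.0, 1.0 - abs(v - c) / width) for c in centers]
            for v in values]


def fuzzy_mutual_information(mu_a, mu_b):
    """Mutual information (in nats) between two fuzzy-partitioned variables.

    Assumption: the joint probability of a pair of fuzzy sets is estimated
    as the mean product of the membership degrees over the samples.
    """
    n = len(mu_a)
    ka, kb = len(mu_a[0]), len(mu_b[0])
    joint = [[sum(ra[i] * rb[j] for ra, rb in zip(mu_a, mu_b)) / n
              for j in range(kb)] for i in range(ka)]
    pa = [sum(row) for row in joint]
    pb = [sum(joint[i][j] for i in range(ka)) for j in range(kb)]
    return sum(joint[i][j] * math.log(joint[i][j] / (pa[i] * pb[j]))
               for i in range(ka) for j in range(kb) if joint[i][j] > 0)


def forward_select(columns, y, n_sets=3):
    """Greedy forward selection; returns indices of the selected columns."""
    mu_y = triangular_partition(y, n_sets)
    mus = [triangular_partition(col, n_sets) for col in columns]
    # Relevance: fuzzy mutual information between each feature and the output.
    relevance = [fuzzy_mutual_information(mu, mu_y) for mu in mus]
    selected = [max(range(len(columns)), key=relevance.__getitem__)]
    remaining = set(range(len(columns))) - set(selected)
    while remaining:
        scores = {}
        for j in remaining:
            # Redundancy: average fuzzy mutual information between the
            # candidate and the already selected features.
            redundancy = sum(fuzzy_mutual_information(mus[j], mus[s])
                             for s in selected) / len(selected)
            scores[j] = relevance[j] - redundancy
        best = max(scores, key=scores.get)
        # Placeholder stopping rule (an assumption): stop when no candidate
        # is more relevant than it is redundant. The criterion described in
        # the text instead uses the average relevance and redundancy of the
        # already selected features.
        if scores[best] <= 0:
            break
        selected.append(best)
        remaining.remove(best)
    return selected
```

Here `columns` is a list of feature columns (one list of numbers per feature) and `y` is the output column. A call such as `forward_select([x1, x2, x3], y)` returns the indices of the retained features in the order in which they were selected.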