As a typical multigranularity data analysis model, multi-scale rough sets have attracted considerable attention in recent years. However, classical multi-scale rough sets and most of their extensions can handle only discrete data, which limits their applicability. To overcome this limitation, we investigate a fuzzy generalization of multi-scale rough sets and its application to feature selection for continuous data. To this end, a new type of decision system, the multi-scale fuzzy decision system, is formalized to represent knowledge at different scales. Scaled fuzzy granules are introduced in terms of a family of scaled fuzzy relations and are used to present the granular structures of fuzzy lower and upper approximations. A heuristic lattice-based optimal scale selection algorithm is then put forward from the viewpoint of maintaining the consistency of decision systems. Decision rules with strong generalization ability can be obtained by selecting appropriate scales. Finally, a forward feature selection algorithm is developed on the basis of the optimal scale to remove redundant fuzzy relations. Extensive numerical experiments are conducted to compare the proposed algorithm with several state-of-the-art algorithms. The experimental results show that the proposed model improves the generalization ability of fuzzy rough sets and is both feasible and effective.
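
For orientation, the fuzzy lower and upper approximations mentioned above can be illustrated with one standard formulation from the fuzzy rough set literature (the Dubois and Prade style). This is only a sketch: the symbols $U$ (universe of objects), $R^{k}$ (a fuzzy relation at an assumed scale index $k$), and $X$ (a fuzzy set on $U$) are notation introduced here for illustration, and the definitions adopted in the paper itself may differ.

```latex
\[
  \underline{R^{k}}(X)(x) = \inf_{y \in U} \max\bigl\{\, 1 - R^{k}(x,y),\; X(y) \,\bigr\},
  \qquad
  \overline{R^{k}}(X)(x) = \sup_{y \in U} \min\bigl\{\, R^{k}(x,y),\; X(y) \,\bigr\}.
\]
```

Under this formulation, if $R^{k}$ is reflexive, then $\underline{R^{k}}(X)(x) \le X(x) \le \overline{R^{k}}(X)(x)$ for every $x \in U$, so the two approximations bracket the membership degree of $x$ in $X$.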