Feature selection is an important challenge in many areas of machine learning because it plays a crucial role in the interpretation of machine-driven decisions. There are various approaches to the feature selection problem, and methods based on information theory form an important group. Among these, minimum redundancy maximum relevance (mRMR) feature selection is undoubtedly the most popular, with widespread application. In this paper, we prove, contrary to an existing finding, that mRMR is not equivalent to the Max-Dependency criterion for first-order incremental feature selection. We present another form of equivalence, which leads to a generalization of mRMR feature selection. Additionally, we compare several feature selection methods based on mRMR, Max-Dependency, and feature ranking, employing different measures of dependency. The results on high-dimensional real-world datasets show that distance correlation is a suitable measure for dependency-based feature selection methods. The results also indicate that the Max-Dependency incremental algorithm combined with distance correlation is a promising feature selection approach.
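To make the last idea concrete, the following is a minimal sketch of incremental (greedy forward) Max-Dependency selection that uses the sample distance correlation as the dependency measure. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the function names `distance_correlation` and `greedy_max_dependency`, the use of the biased (V-statistic) estimator, and the tie-breaking behavior are all choices made here for the example.

```python
import numpy as np


def _centered_distances(x):
    """Double-centered Euclidean distance matrix of the rows of x."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:
        x = x[:, None]  # treat a 1-D array as n samples of one feature
    diff = x[:, None, :] - x[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    return (d
            - d.mean(axis=0, keepdims=True)
            - d.mean(axis=1, keepdims=True)
            + d.mean())


def distance_correlation(x, y):
    """Sample distance correlation (biased estimator) between x and y.

    x and y may be 1-D or 2-D arrays with the same number of rows.
    """
    a = _centered_distances(x)
    b = _centered_distances(y)
    dcov2 = (a * b).mean()
    denom = np.sqrt((a * a).mean() * (b * b).mean())
    if denom == 0.0:
        return 0.0  # a constant variable carries no dependency
    return float(np.sqrt(max(dcov2, 0.0) / denom))


def greedy_max_dependency(X, y, k):
    """Pick k feature indices; each step adds the feature that maximizes
    the distance correlation between the selected feature subset
    (taken jointly) and the target y.

    Assumption of this sketch: plain greedy forward search, with ties
    resolved in favor of the lowest feature index.
    """
    X = np.asarray(X, dtype=float)
    selected = []
    remaining = list(range(X.shape[1]))
    for _ in range(min(k, len(remaining))):
        best = max(
            remaining,
            key=lambda j: distance_correlation(X[:, selected + [j]], y),
        )
        selected.append(best)
        remaining.remove(best)
    return selected
```

For example, `greedy_max_dependency(X, y, 10)` returns the indices of ten features chosen one at a time. Because the pairwise distance matrices take O(n^2) memory in the number of samples n, this sketch is only suited to datasets with a moderate number of samples, although the number of features may be large.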