There is a surge in the application of population-based metaheuristic algorithms to find the optimal feature subset from high-dimensional datasets. Many of these approaches do not scale well, especially as they must balance two opposing goals: maximizing classification accuracy while minimizing the number of selected features. In this study, a novel binary greater cane rat algorithm (GCRA) is proposed, inspired by the intelligent nocturnal behavior of GCRs, which significantly affects their foraging and mating activities. As they forage, they leave trails to food sources, shelters, and water, and this information is retained by the dominant male. They also split into male and female groups during the mating season, which coincides with abundant food supply and proximity to water sources. This behavior is modeled into an effective method for selecting the optimal feature subset from high-dimensional datasets using two different approaches. Firstly, five variants of binary GCRA are developed, each using one transfer function from the S-shaped, V-shaped, U-shaped, Z-shaped, and quadratic families to binarize the GCRA. Secondly, a threshold function that maps each variable to 0 or 1 is used to develop a sixth variant. The performance of the six variants was evaluated on 12 datasets of different dimensionalities. The experimental results show the stability of all the proposed methods, as they generally performed competitively. However, the threshold version, known as BGCRA, performed better, yielding the highest classification accuracy on 9 of the 12 datasets used in the study and ranking second in selecting the fewest important features. It also showed superiority over the other variants by yielding the lowest average fitness values on 11 of the 12 datasets (91.6%). Hence, BGCRA was used for further comparative analysis against 5 other popular feature selection (FS) algorithms, with outstanding performance: it produced the highest mean classification accuracy on 91.6% (11 of 12) of the datasets, the lowest average fitness values on all datasets, and the smallest average number of significant features on 91.6% of the datasets. The results were also validated by statistical tests, which showed that BGCRA is significantly superior to the other methods.
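To make the two binarization approaches concrete, the sketch below illustrates how a continuous GCRA position vector could be mapped to a binary feature mask, first through a transfer function and then through a fixed threshold. The choice of a standard sigmoid for the S-shaped family and a 0.5 threshold are assumptions for illustration only; the abstract does not specify the exact transfer functions or threshold used in the study.

```python
import numpy as np

def s_shaped_binarize(x, rng=None):
    """Binarize a continuous position vector with a standard S-shaped
    (sigmoid) transfer function: bit i is set to 1 with probability
    sigmoid(x_i). Illustrative only; the paper's five transfer-function
    variants may use different formulas."""
    rng = np.random.default_rng() if rng is None else rng
    probs = 1.0 / (1.0 + np.exp(-x))
    return (rng.random(x.shape) < probs).astype(int)

def threshold_binarize(x, threshold=0.5):
    """Binarize by mapping each variable directly to 0 or 1 against a
    fixed threshold, in the spirit of the BGCRA variant. The value 0.5
    is an assumed placeholder."""
    return (x > threshold).astype(int)

# Example: a candidate feature mask for a 10-feature dataset
position = np.random.uniform(-2, 2, size=10)
print(s_shaped_binarize(position))   # stochastic bit vector
print(threshold_binarize(position))  # deterministic bit vector
```

In both cases the resulting bit vector indicates which features are kept (1) or discarded (0) before the classifier is trained and the fitness (combining accuracy and subset size) is evaluated.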