The voice signal carries a wide range of information about the speaker, including physical characteristics, emotional state, and health. Estimating these physical characteristics from speech has applications in forensics, security, surveillance, marketing, and customer service. The primary goal of this research is to identify the acoustic features that aid in estimating a speaker’s age. To this end, an ensemble feature selection model is proposed that selects the best features from a baseline acoustic feature vector for age estimation from speech. Starting from a feature vector that covers various spectral, temporal, and prosodic aspects of speech, ensemble-based automatic feature selection is performed in two steps: first, feature importances or ranks are calculated by several individual feature selection methods; then, voting is applied to the resulting ranks to obtain the subset of features ranked highest across all methods. The proposed method is evaluated on the TIMIT dataset and achieves a mean absolute error (MAE) of 5.58 years for male speakers and 5.12 years for female speakers.

Index Terms: Age Estimation, Feature Selection, Ensemble Selection, TIMIT dataset.
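
The rank-then-vote step summarized above can be illustrated with a minimal sketch. The abstract does not state which voting rule is used, so a Borda count is assumed here purely for illustration; the function name `borda_vote`, the feature names, and the three example rankings are all hypothetical and do not come from the paper.

```python
from collections import defaultdict


def borda_vote(rankings, top_k):
    """Combine several feature rankings into one subset by Borda count.

    rankings: list of lists, each ordering the same feature names from
    most to least important according to one selection method.
    Assumption: Borda count stands in for the unspecified voting rule.
    """
    scores = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, feature in enumerate(ranking):
            # The top feature of an n-item ranking earns n - 1 points,
            # the last one earns 0.
            scores[feature] += n - 1 - position
    # Sort by total points (descending); break ties by name so the
    # result is deterministic.
    ordered = sorted(scores, key=lambda f: (-scores[f], f))
    return ordered[:top_k]


# Hypothetical rankings from three selectors over five made-up features.
rankings = [
    ["f0_mean", "mfcc_1", "jitter", "energy", "zcr"],
    ["mfcc_1", "f0_mean", "energy", "jitter", "zcr"],
    ["f0_mean", "jitter", "mfcc_1", "zcr", "energy"],
]
selected = borda_vote(rankings, top_k=3)
print(selected)
```

In practice each ranking would come from a separate feature selection method applied to the acoustic feature vector, and the selected subset would then be passed to the age regressor. Other voting rules, such as majority voting over each method's top-k set, would fit the same interface.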