Deep learning (DL)-based percussion-acoustic methods have gained attention, but their multi-layer architectures and iterative processes increase computational time and power consumption. This paper proposes a lightweight void detection method for concrete-filled steel tubular (CFST) structures using the Mel-frequency cepstral coefficient (MFCC) algorithm and ensemble machine learning. A global average pooling (GAP) layer is applied to reduce the two-dimensional MFCC matrix to one-dimensional (1D) time-frequency parameters. Combined with ensemble machine learning, this yields lightweight models that require fewer parameters, less computation, and less memory. Validation experiments were carried out on a CFST specimen with void depths of 0, 30, 50, and 80 mm. Robustness tests considering tap intervals, sampling frequencies, and background noise were performed using data augmentation strategies. Two studies investigate the feature extraction method: the choice of time-frequency analysis algorithm (MFCC or short-time Fourier transform) and the choice of dimensionality reduction operation (average, maximum, or minimum pooling). Comparative analyses with a 1D dilated convolutional neural network (1D dilated CNN) show that MFCC-based ensemble machine learning achieves excellent test accuracy on the raw signal dataset and better generalization on the reconstructed signal dataset. Moreover, the ensemble machine learning models are significantly more computationally efficient than DL models. Specifically, the random forest model trains 17,510 times faster than the 1D dilated CNN. The proposed lightweight CFST void detection method, which combines 1D MFCC parameters with ensemble machine learning, therefore achieves high accuracy at low computational cost and shows promising potential for practical CFST applications.
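The pooling step that reduces the 2D MFCC matrix to 1D parameters can be illustrated with a minimal sketch. It assumes the matrix is laid out as coefficients by time frames and that pooling runs along the time axis, which the abstract does not state; the function name `pool_over_time` and the toy values are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def pool_over_time(mfcc, mode="average"):
    """Collapse a 2D matrix (n_coefficients x n_frames) into a 1D vector.

    Assumption: pooling runs along the time (frame) axis, giving one value
    per cepstral coefficient. The pooling axis is not stated in the source.
    """
    reducers = {"average": fmean, "maximum": max, "minimum": min}
    if mode not in reducers:
        raise ValueError(f"unknown pooling mode: {mode!r}")
    reduce_row = reducers[mode]
    return [reduce_row(row) for row in mfcc]


# Toy matrix: 3 coefficients x 4 frames (made-up numbers, not measured data).
toy_mfcc = [
    [1.0, 2.0, 3.0, 4.0],
    [-2.0, 0.0, 2.0, 4.0],
    [0.5, 0.5, 0.5, 0.5],
]

# One 1D feature vector per pooling mode compared in the abstract.
features = {mode: pool_over_time(toy_mfcc, mode)
            for mode in ("average", "maximum", "minimum")}
```

Each resulting vector has one entry per coefficient regardless of how many frames the tap signal spans, so a fixed-length input of this kind could be passed to an ensemble classifier such as a random forest.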