Background and purpose
Diagnosis of depression is based on tests performed by psychiatrists and on information provided by patients or their relatives. In the field of machine learning (ML), numerous models have been devised to detect depression automatically through the analysis of speech audio signals. While deep learning approaches often achieve superior classification accuracy, they are notably resource-intensive. This research introduces an innovative multilevel hybrid feature extraction-based classification model, specifically designed for depression detection, which exhibits reduced time complexity.

Materials and methods
The MODMA dataset, consisting of speech audio signals from 29 healthy subjects and 23 subjects with major depressive disorder (MDD), was used. The constructed model architecture integrates multilevel hybrid feature extraction, iterative feature selection, and classification. During the Hybrid Handcrafted Feature (HHF) generation stage, a combination of textural and statistical methods was employed to extract low-level features from the speech audio signals. To extend this process to high-level feature creation, a Multilevel Discrete Wavelet Transform (MDWT) was applied. This technique produced wavelet subbands, which were then fed into the same hybrid feature extractor, enabling the extraction of both high- and low-level features. Iterative Neighborhood Component Analysis (INCA) was utilized to select the most pertinent features from the extracted vectors. Finally, in the classification phase, a one-dimensional nearest neighbor classifier, combined with ten-fold cross-validation, was implemented to obtain the results.

Results
The HHF-based speech audio signal classification model attained excellent performance, with a classification accuracy of 94.63%.

Conclusions
The findings validate the remarkable proficiency of the introduced HHF-based model in depression classification, underscoring its computational efficiency.
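For concreteness, the pipeline summarized above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the wavelet choice (db4, four decomposition levels), the particular handcrafted descriptors, and the use of mutual information as a stand-in ranking for NCA feature weights (scikit-learn exposes no per-feature NCA weighting comparable to MATLAB's fscnca) are all assumptions, and the synthetic data merely mirrors the 29/23 class split of MODMA.

```python
# Hedged sketch of the abstract's pipeline: multilevel DWT subbands feed a
# handcrafted feature extractor, an INCA-style iterative selector picks the
# best feature count, and a nearest neighbor classifier is scored with
# ten-fold cross-validation. Names and parameters are illustrative.
import numpy as np
import pywt
from sklearn.feature_selection import mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

def handcrafted_features(x):
    """Low-level statistical descriptors of one signal (illustrative subset;
    the paper also uses textural features not reproduced here)."""
    return np.array([x.mean(), x.std(), np.abs(x).max(),
                     ((x[:-1] * x[1:]) < 0).mean()])  # zero-crossing rate

def multilevel_features(signal, wavelet="db4", levels=4):
    """High- and low-level vector: descriptors of the raw signal plus of
    each multilevel DWT subband (assumed wavelet/level choice)."""
    coeffs = pywt.wavedec(signal, wavelet, level=levels)
    parts = [handcrafted_features(signal)]
    parts += [handcrafted_features(c) for c in coeffs]
    return np.concatenate(parts)

def iterative_selection(X, y, k_range):
    """INCA-style loop: rank features, sweep the kept-feature count, and
    keep the subset with the best 10-fold CV accuracy. Ranking here uses
    mutual information as a stand-in for NCA weights (an assumption)."""
    order = np.argsort(mutual_info_classif(X, y, random_state=0))[::-1]
    best_acc, best_idx = -1.0, order[:1]
    for k in k_range:
        idx = order[:k]
        acc = cross_val_score(KNeighborsClassifier(n_neighbors=1),
                              X[:, idx], y, cv=10).mean()
        if acc > best_acc:
            best_acc, best_idx = acc, idx
    return best_idx, best_acc

# Toy usage on synthetic signals (real inputs would be MODMA recordings).
rng = np.random.default_rng(0)
signals = rng.standard_normal((52, 4096))
labels = np.array([0] * 29 + [1] * 23)  # 29 healthy, 23 MDD, as in the paper
X = np.stack([multilevel_features(s) for s in signals])
idx, acc = iterative_selection(X, labels, k_range=range(2, X.shape[1] + 1))
print(f"selected {len(idx)} features, 10-fold CV accuracy = {acc:.3f}")
```

The nearest neighbor classifier (k = 1 here) and the accuracy-driven sweep over feature counts reflect the abstract's emphasis on low time complexity: both the selector and the final classifier are shallow, avoiding the resource demands of deep learning approaches.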