Abstract

This work presents an Indian language identification (LID) system based on automatic tonal/nontonal preclassification. Languages are first classified into tonal and nontonal categories, and individual languages are then identified within the respective category. The work proposes pitch Chroma and formant features for this task and investigates how Mel‐frequency cepstral coefficients (MFCCs) complement them. It further explores three feature extraction approaches, namely block processing (BP), pitch synchronous analysis (PSA), and glottal closure regions (GCRs), using syllables as the basic units. A Cascade convolutional neural network (CNN)‐long short‐term memory (LSTM) model operating on syllable‐level features has been developed. The National Institute of Technology Silchar language database (NITS‐LD) and the OGI‐Multilingual Telephone Speech Corpus (OGI‐MLTS) have been used for experimental validation. The highest accuracies are reported by the proposed system based on the score combination of the Cascade CNN‐LSTM models of Chroma (extracted with the BP method) and of the first two formants and MFCCs (both extracted with the GCR method). In the preclassification stage, the observed accuracies on NITS‐LD are 91%, 87.3%, and 85.1% for 30 s, 10 s, and 3 s test data, respectively; the corresponding accuracies on OGI‐MLTS are 86.7%, 83.1%, and 80.6%. These amount to absolute improvements over the baseline system of 11.6%, 12.3%, and 13.9% for NITS‐LD and 12.5%, 11.9%, and 12.6% for OGI‐MLTS. The proposed preclassification‐based LID system improves on the baseline system by 7.3%, 6.4%, and 7.4% for NITS‐LD and by 6.1%, 6.7%, and 7.2% for OGI‐MLTS under the three respective test data conditions.
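The two-stage decision with score combination described above can be illustrated with a minimal sketch. It is not the system's actual implementation: the equal fusion weights, the function names, the placeholder language labels, and all score values are assumptions made for illustration only. Each dictionary stands for the per-class scores of one feature stream (Chroma, formants, MFCCs).

```python
from typing import Dict, List, Tuple

Scores = Dict[str, float]


def fuse_scores(score_sets: List[Scores], weights: List[float]) -> Scores:
    """Weighted sum of per-class scores, one dict per feature stream."""
    if len(score_sets) != len(weights):
        raise ValueError("need exactly one weight per score set")
    fused: Scores = {}
    for scores, weight in zip(score_sets, weights):
        for label, value in scores.items():
            fused[label] = fused.get(label, 0.0) + weight * value
    return fused


def best_label(scores: Scores) -> str:
    """Return the label with the highest fused score."""
    return max(scores, key=scores.get)


def identify(
    category_scores: List[Scores],
    language_scores: Dict[str, List[Scores]],
    weights: List[float],
) -> Tuple[str, str]:
    """Stage 1: pick tonal/nontonal. Stage 2: pick a language in that category."""
    category = best_label(fuse_scores(category_scores, weights))
    language = best_label(fuse_scores(language_scores[category], weights))
    return category, language


# Hypothetical scores from three feature streams (assumed values, not real data).
weights = [1 / 3, 1 / 3, 1 / 3]  # equal weighting is an assumption
category_scores = [
    {"tonal": 0.6, "nontonal": 0.4},  # Chroma stream
    {"tonal": 0.3, "nontonal": 0.7},  # formant stream
    {"tonal": 0.8, "nontonal": 0.2},  # MFCC stream
]
language_scores = {
    "tonal": [
        {"tonal_A": 0.2, "tonal_B": 0.8},
        {"tonal_A": 0.6, "tonal_B": 0.4},
        {"tonal_A": 0.3, "tonal_B": 0.7},
    ],
    "nontonal": [
        {"nontonal_A": 0.5, "nontonal_B": 0.5},
        {"nontonal_A": 0.9, "nontonal_B": 0.1},
        {"nontonal_A": 0.4, "nontonal_B": 0.6},
    ],
}

result = identify(category_scores, language_scores, weights)
print(result)
```

In this sketch the second stage only compares languages inside the category chosen by the first stage, so a preclassification error cannot be recovered later; this is why the preclassification accuracy bounds the accuracy of the overall system.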