This paper presents a model that uses modulation and energy components for speaker recognition. The model follows the short-term scenario of speech signal processing, and we also introduce a parameter combination that includes both the instantaneous components and the energy parameters. We describe the importance of short-term speech analysis in estimating the modulation parameters and the role of the instantaneous energy in estimating speaker-dependent parameters. The short-term scenario is used, first, to avoid the silent and background-noise portions present in speech signals and, second, to benefit from the stationarity assumption of short-term speech processing. Energy components, on the other hand, are used on their own in many speech parameterisation models, such as linear predictive coding (LPC) and Mel-frequency cepstral coefficients (MFCCs). The main idea of our mixture parameter (the MFCC/AM-FM model) is to determine the extent to which these components can jointly contribute to extracting parameters that relate to the speaker more than to anything else present in the speech signal. We evaluated both models on text-dependent and text-independent speech corpora. The accuracy results show that the frame-based AM-FM model achieves better performance than the traditional structure of the AM-FM modulation model (the model presented in [1]). In the text-dependent case, the MFCC/AM-FM parameters perform much better than both the AM-FM parameters and the MFCC parameters. In the text-independent case, however, the MFCC/AM-FM model provides better results than the MFCC features but lower performance than the AM-FM modulation parameters.
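To make the short-term scenario concrete, the following is a minimal sketch of frame-based processing: the signal is split into short frames, low-energy frames are discarded, and instantaneous amplitude and frequency are estimated per frame. It is an illustration under stated assumptions, not the method evaluated in this paper. In particular, the analytic-signal (Hilbert) demodulation, the relative energy threshold of 0.05, and all function names are assumptions introduced here.

```python
import numpy as np

def frame_signal(x, frame_len, hop):
    """Split a 1-D signal into overlapping short-term frames."""
    n_frames = 1 + (len(x) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx]

def drop_low_energy_frames(frames, rel_threshold=0.05):
    """Keep frames whose mean energy exceeds a fraction of the maximum.

    The threshold rule is a hypothetical choice for illustration only.
    """
    energy = np.mean(frames ** 2, axis=1)
    keep = energy > rel_threshold * energy.max()
    return frames[keep], keep

def analytic_signal(frame):
    """Analytic signal of one frame, computed with the FFT."""
    n = len(frame)
    spec = np.fft.fft(frame)
    h = np.zeros(n)
    h[0] = 1.0
    if n % 2 == 0:
        h[n // 2] = 1.0
        h[1:n // 2] = 2.0
    else:
        h[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spec * h)

def instantaneous_components(frame, fs):
    """Instantaneous amplitude and frequency (Hz) of one frame.

    Assumption: analytic-signal demodulation stands in for whichever
    AM-FM demodulation scheme the model actually uses.
    """
    z = analytic_signal(frame)
    amplitude = np.abs(z)
    phase = np.unwrap(np.angle(z))
    frequency = np.diff(phase) * fs / (2.0 * np.pi)
    return amplitude, frequency

def frame_features(frames, fs):
    """Per-frame summary: mean amplitude, mean frequency, mean energy."""
    feats = []
    for f in frames:
        a, freq = instantaneous_components(f, fs)
        feats.append([a.mean(), freq.mean(), np.mean(f ** 2)])
    return np.array(feats)
```

In practice the demodulation would normally be applied to band-pass filtered versions of each frame rather than to the full-band signal, and the per-frame summaries could then be combined with MFCCs to form a mixed feature vector.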