Purpose: To develop a method for individual parameter estimation of four hydraulic-analogy bioenergetic models and to assess the validity and reliability of these models' prediction of aerobic and anaerobic metabolic utilization during sprint roller-skiing.
Methods: Eleven elite cross-country skiers performed two treadmill roller-skiing time trials on a course consisting of three flat sections interspersed with two uphill sections. Aerobic and anaerobic metabolic rate contributions, external power output, and gross efficiency were determined. Two versions each (fixed or free maximal aerobic metabolic rate) of a two-tank hydraulic-analogy bioenergetic model (2TM-fixed and 2TM-free) and a more complex three-tank model (3TM-fixed and 3TM-free) were programmed into MATLAB. The aerobic metabolic rate (MRae) and the accumulated anaerobic energy expenditure (Ean,acc) from the first time trial (STT1), together with a gray-box model in MATLAB, were used to estimate the bioenergetic model parameters. Validity was assessed by simulating each bioenergetic model using the estimated parameters from STT1 and the total metabolic rate (MRtot) in the second time trial (STT2).
Results: The parameter estimation method based on STT1 gave valid and reliable overall results for all four models vs. measurement data, with the 2TM-free model being the most valid. Mean differences in model-vs.-measured MRae ranged between −0.005 and 0.016 kW, with typical errors between 0.002 and 0.009 kW. Mean differences in Ean,acc at STT termination ranged between −4.3 and 0.5 kJ, with typical errors between 0.6 and 2.1 kJ. The root mean square error (RMSE) for 2TM-free on the instantaneous STT1 data was 0.05 kW for MRae and 0.61 kJ for Ean,acc, both lower than for the other three models (all P < 0.05).
Compared to the results in STT1, the validity and reliability of each individually adapted bioenergetic model were worse during STT2, with the models underpredicting MRae and overpredicting Ean,acc vs. measurement data (all P < 0.05). Moreover, the 2TM-free had the lowest RMSEs during STT2.
Conclusion: The 2TM-free provided the highest validity and reliability in MRae and Ean,acc for both the parameter estimation in STT1 and the model evaluation in the subsequent STT2.
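
The agreement statistics reported above (mean difference, typical error, and RMSE between modelled and measured values) can be illustrated with a minimal Python sketch. This is not the authors' MATLAB implementation. The function name `agreement_metrics` and the sample values are hypothetical, and the typical error is assumed here to be the standard deviation of the differences divided by √2, a common convention that the abstract does not confirm.

```python
import math


def agreement_metrics(modelled, measured):
    """Return mean difference, RMSE and typical error for paired samples.

    Hypothetical helper for illustration only. Typical error is ASSUMED
    to be SD(differences) / sqrt(2); the abstract does not state which
    definition was used.
    """
    if len(modelled) != len(measured) or len(modelled) < 2:
        raise ValueError("need two equally long series with at least 2 samples")

    # Model-minus-measured difference for every paired sample.
    diffs = [m - y for m, y in zip(modelled, measured)]
    n = len(diffs)

    mean_diff = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)

    # Sample standard deviation of the differences (n - 1 denominator).
    sd_diff = math.sqrt(sum((d - mean_diff) ** 2 for d in diffs) / (n - 1))
    typical_error = sd_diff / math.sqrt(2)

    return {"mean_diff": mean_diff, "rmse": rmse, "typical_error": typical_error}


if __name__ == "__main__":
    # Made-up MRae samples in kW, NOT data from the study.
    modelled_mrae = [1.52, 1.60, 1.71, 1.69]
    measured_mrae = [1.50, 1.63, 1.70, 1.74]
    print(agreement_metrics(modelled_mrae, measured_mrae))
```

A negative mean difference in such a comparison would correspond to the model underpredicting the measured quantity, as reported for MRae in STT2.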