A robust extension of normal-theory regression is to add an extra parameter that models the kurtosis of the error distribution, for example by using the T-family or the power-exponential family of distributions. The statistical properties of maximum likelihood estimation for both families of models are considered. This article extends the work of Lange et al. (1989), in which the usefulness of the multivariate T-family for modelling data was demonstrated. The cost of adding the extra kurtosis parameter to the standard normal model is considered. This cost is measured by comparing the variance of the quantity of interest when the estimation of the extra parameter is taken into account with the variance when the estimated value of the extra parameter is treated as if it were known. Lange et al. (1989) show in a general setting that, asymptotically, there is no cost due to the estimation of the extra parameter when the quantities of interest are the location or regression parameters. Whether this finding remains valid in small samples for the T and power-exponential families is investigated in this article using Monte Carlo simulations and a real dataset. The efficiency of parameter estimates and the coverage rates of confidence intervals are also considered under three scenarios: when the extra kurtosis parameter is estimated from the data, when it is fixed at the true value, and when it is fixed at a wrong value. The expected information matrix is used to construct the confidence intervals, and the comparisons are based on asymptotic calculations and Monte Carlo simulations. It is found that for the T-family there is very little cost due to estimation of the extra parameter, except for small sample sizes. The inflation in variance due to the estimation of the extra parameter increases as the sample size decreases.
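The cost measure described above can be written compactly with a partitioned information matrix. The following is only a sketch under assumed notation: the symbols $\phi$ (parameters of interest), $\nu$ (kurtosis parameter) and $I$ (expected information) are introduced here for illustration and are not taken from the article.

```latex
I(\phi,\nu) =
\begin{pmatrix}
I_{\phi\phi} & I_{\phi\nu} \\
I_{\nu\phi} & I_{\nu\nu}
\end{pmatrix},
\qquad
\operatorname{Var}_{\text{est}}(\hat\phi) \approx
\bigl(I_{\phi\phi} - I_{\phi\nu}\, I_{\nu\nu}^{-1}\, I_{\nu\phi}\bigr)^{-1},
\qquad
\operatorname{Var}_{\text{known}}(\hat\phi) \approx I_{\phi\phi}^{-1}.
```

In this notation the two asymptotic variances coincide whenever $I_{\phi\nu} = 0$, which is one way to read the asymptotic no-cost result for location or regression parameters. Otherwise the first is at least as large as the second, and their ratio gives the inflation in variance.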
In the Monte Carlo simulations of simple regression settings, the inflation in variance is found to be at most 14% for the T-family and at most 62% for the power-exponential family. For the T-family, there is a considerable loss of efficiency in fitting a normal model when the true degrees of freedom is small, but only a small loss of efficiency if a model with a low number of degrees of freedom is fit to normal observations. In contrast, the coverage rates of confidence intervals are close to the nominal level if a normal model is fit to the data, whatever the true degrees of freedom, but the coverage rates can be too low if a model with a low number of degrees of freedom is fit to normal data. The coverage rates of confidence intervals when the degrees of freedom is estimated from the data are satisfactory, except at small sample sizes. Similar results are obtained for the power-exponential family. In addition to showing a larger inflation in variance due to estimation of the extra parameter, the power-exponential family is less satisfactory than the T-family because the extra kurtosis parameter is frequently estimated to be at the boundary of its range in small samples. The findings in this article support the notion that the extra kurtosis