Many previous studies have examined the fidelity of the decadal experiment in the Coupled Model Intercomparison Project Phase 5 (CMIP5) for different climate variables at multiple temporal and spatial scales. However, most of these studies addressed temperature and temperature-based climate indices. Only limited work has been done on precipitation from the decadal experiment, and no attention has been paid to the catchment level. This study evaluates the performance of eight GCMs (MIROC4h, EC-EARTH, MRI-CGCM3, MPI-ESM-MR, MPI-ESM-LR, MIROC5, CMCC-CM, and CanCM4) in reproducing monthly hindcast precipitation from the decadal experiment over the Brisbane River catchment in Queensland, Australia. First, the GCM datasets were spatially interpolated to a resolution of 0.05° × 0.05° (approximately 5 × 5 km) to match the grid of the observed data and were then clipped to the catchment. Next, the model outputs were evaluated against observed values for temporal skill, dry and wet periods, and total precipitation (over time and space). The skill tests revealed that model performance varied with the initialization year and that scores were comparatively higher for initialization years from 1990 onward. Models with finer spatial resolution performed comparatively better than models with coarser spatial resolution; MIROC4h performed best, followed by EC-EARTH and MRI-CGCM3. Based on their performance, the models were grouped into three categories: high-performing models (MIROC4h, EC-EARTH, and MRI-CGCM3) in the first category, middle-performing models (MPI-ESM-LR and MPI-ESM-MR) in the second, and comparatively low-performing models (MIROC5, CanCM4, and CMCC-CM) in the third. To compare the performance of multi-model ensemble means (MMEMs), three MMEMs were formed: the arithmetic mean of the first-category models formed MMEM1, that of the second- and third-category models formed MMEM2, and that of all eight models formed MMEM3. The MMEMs were assessed using the same skill tests, and MMEM2 performed best, which suggests that evaluating the performance of individual models is highly important before forming an MMEM.
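
The formation of the three MMEMs can be illustrated with a minimal sketch. It assumes each model's catchment-averaged monthly precipitation is already available as an equal-length list of values on a common time axis. The function names (`ensemble_mean`, `build_mmems`) and the demonstration values are hypothetical placeholders for illustration only; they are not the study's code or data. The category membership follows the grouping stated above.

```python
from statistics import fmean

# Category membership as listed in the abstract.
CATEGORIES = {
    1: ["MIROC4h", "EC-EARTH", "MRI-CGCM3"],
    2: ["MPI-ESM-LR", "MPI-ESM-MR"],
    3: ["MIROC5", "CanCM4", "CMCC-CM"],
}


def ensemble_mean(series_by_model, members):
    """Arithmetic mean across the given models at each time step."""
    # zip(*...) turns per-model series into per-time-step tuples.
    columns = zip(*(series_by_model[name] for name in members))
    return [fmean(values) for values in columns]


def build_mmems(series_by_model):
    """Form MMEM1 (category 1), MMEM2 (categories 2 and 3), MMEM3 (all eight)."""
    return {
        "MMEM1": ensemble_mean(series_by_model, CATEGORIES[1]),
        "MMEM2": ensemble_mean(series_by_model, CATEGORIES[2] + CATEGORIES[3]),
        "MMEM3": ensemble_mean(
            series_by_model, CATEGORIES[1] + CATEGORIES[2] + CATEGORIES[3]
        ),
    }


# Synthetic placeholder series (NOT real precipitation): the model at
# position i gets the three-step series [i, i + 1, i + 2].
all_models = [name for members in CATEGORIES.values() for name in members]
demo = {
    name: [float(i), float(i) + 1.0, float(i) + 2.0]
    for i, name in enumerate(all_models)
}

mmems = build_mmems(demo)
for label, series in mmems.items():
    print(label, series)
```

In practice the same averaging would be applied per grid cell and per month after the interpolation and clipping steps; the sketch collapses that to a single series per model to keep the arithmetic visible.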