This paper documents the performance of the organic soil version of the Canadian Land Surface Scheme (CLASS) in modelling the hydrology and energy balance of the Beverly Swamp, Southern Ontario. The hydrometeorological dataset used to assess model performance begins in the autumn of 1983 and spans 33 months, presenting the first multi‐year characterization of the area. The Beverly Swamp receives approximately 900 mm of precipitation per year, of which one third is lost to net runoff and the remainder to evaporation. Vertical drainage at this site is impeded by a marl layer below the highly decomposed peat soil, at approximately 1‐m depth. This mixed‐forest wetland is unique among surfaces used for CLASS testing to date. Within CLASS, vertical drainage at the bottom of the soil profile is set to zero to represent the marl subsurface boundary. Preliminary runs showed that this configuration produced ponded water on site after each melt period, and that the ponding persisted from year to year. The inclusion of a simple lateral drainage function in CLASS simulated the measured lateral surface flow and effectively reproduced seasonal differences in water table position. Comparisons between measured and modelled diurnally averaged energy budget components from two summers indicate a marked tendency for CLASS to underestimate latent heat flux (QE) by 29% of the observed values, with the error being predominantly systematic. Concurrent with this error is an overestimation of the magnitude of soil heat storage (QG) by a factor of seven, where the error is again predominantly systematic. Modifications made to the canopy resistance parametrization, based on site measurements, improved model estimates of QE, reducing the underestimation to 12% of observed values and changing the major cause of error from systematic to unsystematic.
The improvement in QE corresponded with a change in the prediction of sensible heat flux (QH). A tendency to overestimate QH by 20% of the observed values changed to an underestimation of QH by 14%, the error being unsystematic in each case. The modifications resulted in no significant change to either the magnitude or the nature of the error for QG. A comparison of modelled daily average temperatures for the third soil layer with temperatures measured at 1‐m depth (the centre of the layer) indicated that modelled values had more extreme minima and maxima, although some of this discrepancy could be attributable to the heterogeneous nature of the soil column and the unavoidable comparison of point with layer average temperatures. Discrepancies also exist between measured and modelled snow mass duration and the timing of melt for three consecutive winters. This suggests that further tests of CLASS, using winter season data, must be conducted before it can be determined whether the model is able to correctly simulate snow accumulation and melt. Total albedo at this site was also poorly modelled, being overestimated during the fall and winter periods. Further test runs determined that this overestimation in total albedo was not contributing significantly to the lower modelled soil temperatures or to the persistence of the winter snowpack. The correspondence between modelled and observed data, particularly given the complexity of the canopy and surface at this study site, is adequate, but it suggests that further code testing and development initiatives should be directed towards improving the simulation of latent and soil heat fluxes, shortwave reflectivity, winter snowpack dynamics, and surface and subsurface moisture transfers, which are especially important in wetland environments.
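The distinction drawn above between systematic and unsystematic error can be illustrated with a short sketch. The abstract does not say how the partition was computed; the code below assumes one common approach, in which modelled values are regressed on observed values by ordinary least squares and the mean squared error is split into a part explained by the fitted line (systematic) and the scatter about that line (unsystematic). The function name `error_decomposition` and the flux values are hypothetical and are not taken from the Beverly Swamp dataset.

```python
from math import sqrt


def error_decomposition(observed, modelled):
    """Split mean squared error into systematic and unsystematic parts.

    Assumed method (not confirmed by the source text): regress modelled
    on observed values with ordinary least squares, then
      systematic MSE   = mean((fitted - observed)^2)
      unsystematic MSE = mean((modelled - fitted)^2)
    The two parts sum to the total MSE.
    """
    n = len(observed)
    if n != len(modelled) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")

    mean_o = sum(observed) / n
    mean_m = sum(modelled) / n
    sxx = sum((o - mean_o) ** 2 for o in observed)
    if sxx == 0:
        raise ValueError("observed values must not all be identical")
    sxy = sum((o - mean_o) * (m - mean_m) for o, m in zip(observed, modelled))

    # Least-squares line: fitted = intercept + slope * observed
    slope = sxy / sxx
    intercept = mean_m - slope * mean_o
    fitted = [intercept + slope * o for o in observed]

    mse_s = sum((f - o) ** 2 for f, o in zip(fitted, observed)) / n
    mse_u = sum((m - f) ** 2 for m, f in zip(modelled, fitted)) / n
    mse = sum((m - o) ** 2 for m, o in zip(modelled, observed)) / n

    return {
        "slope": slope,
        "intercept": intercept,
        "mse": mse,
        "mse_systematic": mse_s,
        "mse_unsystematic": mse_u,
        "rmse": sqrt(mse),
        "systematic_fraction": mse_s / mse if mse > 0 else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical fluxes in W m^-2, for illustration only.
    obs = [100.0, 200.0, 300.0, 400.0]
    biased = [0.71 * o for o in obs]        # uniform 29% underestimate
    scattered = [110.0, 190.0, 310.0, 390.0]  # little bias, mostly scatter
    print(error_decomposition(obs, biased)["systematic_fraction"])
    print(error_decomposition(obs, scattered)["systematic_fraction"])
```

Under this assumed partition, a uniform proportional underestimate is almost entirely systematic, whereas a model that tracks the observations on average but scatters about them is dominated by unsystematic error. This mirrors the shift described above for QE after the canopy resistance modifications.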