The application of automated fault detection and diagnosis algorithms has proven to be an effective way to increase the energy efficiency of buildings. While data-driven strategies have demonstrated their potential for developing such algorithms, they require historical data that represents both healthy and faulty equipment operation, which is rarely available in real facilities. This article presents a methodology for generating synthetic data on both healthy and faulty operation of building heating and air-conditioning equipment, which is especially useful for supervision and fault detection when buildings are replicated or expanded within tertiary or residential building planning programmes. The work investigates the automatic calibration of a hybrid model of a typical air-handling unit, which couples a detailed parametric representation of the system components and their control sequence with a simplified thermal load. The model is then extended with fault models that resemble commonly encountered component faults in order to generate operation data for faulty scenarios. The proposed framework is validated in a real use case by calibrating an air-handling unit system and its control sequence using 15 days of data gathered from the building management system. The results obtained from the injection of seven typical faults are then compared with the fault-free simulations to highlight and validate their impact on key operating variables. The results demonstrate the capability of this methodology to support database generation for fault detection and advanced maintenance of air-conditioning systems in building replicas.