In virtualizing engineered systems, it is essential to construct simulators capable of representing the system in its “as-deployed” state. Any such attempt can only be approximate, given the inherent uncertainties in the loadings and operational conditions of the system, as well as in the configuration of the system itself (geometry, materials, control systems, boundary conditions, etc.). This is especially true for complex systems, such as wind turbines, where a number of assumptions often govern the setup of the engineering models. Such models are often made available at different granularities, with each one offering a different level of precision depending on the quantity of interest (e.g. macroscopic displacements or microscopic strains) and the properties of the acting loads (e.g. amplitude and frequency content). This implies that predictive capability is severely hampered when a single, supposedly best, model is chosen for simulation. Building on this idea, we present a method for fusing the outputs of multiple simulators (e.g. aero-servo-hydro-elastic simulators) to estimate a quantity of interest (QoI) with higher precision. The proposed ensemble learning approach comprises two main building blocks. The first is a clustering step based on a Variational Bayesian Gaussian mixture model, employed to weight each available simulator. Clustering is performed on the binned input space, which allows a probability map to be extracted for each local region of that space. This delivers an adaptive scheme in which different simulators contribute more or less prominently to the prediction of the QoI, depending on the range of the input parameters. In the second step, local weighted Bootstrap Aggregation combines the clustered ensemble of outputs from the individual simulators. A simulated toy example and a wind turbine blade fatigue case study are used to demonstrate the efficacy of the suggested ensemble learning scheme. The approach is compared against alternatives typically adopted in the existing literature, such as Stacking, classical Bagging, and Bayesian Model Averaging. The results confirm an improvement in predictive capability, expressed through a reduction in the generalization error and a narrowing of the associated confidence intervals.
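To make the two building blocks concrete, the following minimal sketch illustrates the general idea on synthetic data, using scikit-learn's BayesianGaussianMixture as one readily available implementation of a Variational Bayesian Gaussian mixture model. The synthetic simulators, the bin width, the inverse-error weighting rule, and the function names are illustrative assumptions for exposition only, not the exact implementation proposed in the paper.

```python
# Sketch: cluster the binned input space with a VB-GMM, derive cluster-local
# simulator weights, and fuse simulator outputs via weighted bootstrap aggregation.
import numpy as np
from sklearn.mixture import BayesianGaussianMixture

rng = np.random.default_rng(0)

# Hypothetical setup: two "simulators" approximating a true QoI over a 1-D input.
x = rng.uniform(0.0, 10.0, size=(500, 1))              # input (e.g. load amplitude)
y_true = np.sin(x[:, 0]) + 0.1 * x[:, 0]                # true quantity of interest
sim_preds = np.column_stack([
    np.sin(x[:, 0]),                                     # simulator A: captures the oscillation
    0.1 * x[:, 0] + rng.normal(0, 0.05, 500),            # simulator B: captures the trend
])

# Step 1: bin the input space and fit a Variational Bayesian GMM to the bins.
bin_width = 1.0
bins = np.floor(x / bin_width)
gmm = BayesianGaussianMixture(n_components=5, random_state=0).fit(bins)
resp = gmm.predict_proba(bins)                           # soft cluster memberships

def fit_local_weights(idx):
    """Cluster-local simulator weights from inverse MSE on a (bootstrap) sample."""
    err = resp[idx].T @ (sim_preds[idx] - y_true[idx, None]) ** 2   # (clusters, simulators)
    w = 1.0 / (err + 1e-12)
    return w / w.sum(axis=1, keepdims=True)

# Step 2: local weighted bootstrap aggregation of the simulator outputs.
n_boot = 50
boot_weights = [fit_local_weights(rng.integers(0, len(x), len(x))) for _ in range(n_boot)]

def fuse(x_new, sim_new):
    """Fuse simulator outputs with cluster-local weights, averaged over bootstrap replicates."""
    r = gmm.predict_proba(np.floor(x_new / bin_width))   # local cluster responsibilities
    preds = [np.sum((r @ w) * sim_new, axis=1) for w in boot_weights]
    return np.mean(preds, axis=0), np.std(preds, axis=0)

y_hat, y_std = fuse(x, sim_preds)
print("fused RMSE:", np.sqrt(np.mean((y_hat - y_true) ** 2)))
```

In this sketch, the bootstrap replicates yield both a fused point prediction and a spread that can be read as a rough confidence band; the adaptive behaviour described above arises because the weights are indexed by cluster, so each simulator dominates only in the regions of the input space where its responsibility-weighted error is low.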