With the ongoing efforts to collect new data for Human Reliability Analysis (HRA) (in particular, from nuclear power plant control room simulators), it becomes important that the coming data will be processed traceably, addressing its underlying variability, eventually in combination with expert judgment. In this direction, this work presents a two-stage Bayesian model to integrate expert-elicited probability estimates and empirical evidence from simulator data in the quantification of HEP values and of the associated variability distributions. The general aim is to provide a data aggregation framework able to mathematically combine diverse information sources throughout the HEP estimation process, in a systematic and reproducible way, contributing to strengthening the empirical basis of future HRA methods. The Bayesian model can be used to produce reference values and bounds for HRA methods as well as to improve the quality of plant-specific HEP estimates for use in Probabilistic Safety Assessment applications. The model is first verified with artificial data and then applied to quantify the HEP of human failure events from literature. Model sensitivity to biases in expert estimates is also investigated.