Batch process monitoring using principal component analysis requires sufficient historical manufacturing data to model the normal operating conditions of the process. However, when a new product is manufactured for the first time in a given facility, very few historical data are available, which results in a small-data scenario. We investigate and improve a data-driven methodology, previously reported in the literature (Tulsyan, Garvin & Ündey (2019). J. Process Control, 77, 114–133), that enables batch process monitoring in such scenarios. The methodology uses machine learning algorithms (based on Gaussian process state-space models) to generate in-silico batch trajectory data from the few available historical batches, and then uses the combined pool of real and in-silico data to build a process monitoring model. We develop automatic procedures to tune the values of several parameters of this machine-learning framework, so that the generation of consistent in-silico batch trajectory data can be streamlined, thus facilitating the deployment of the framework at an industrial level. Furthermore, we develop indicators and a metric to assist the in-silico data generation activity from a process monitoring-relevant perspective. Finally, using datasets from a benchmark simulated semi-batch process for the manufacturing of penicillin, we assess the appropriateness of the in-silico generated data for the purpose of process monitoring.
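As an illustration of the final step described above (building a monitoring model from a pool of real and in-silico batches), the following is a minimal sketch and not the implementation used in this work. It assumes batch-wise unfolding of an (I, K, J) data array, autoscaling, and the two standard PCA monitoring statistics (Hotelling's T² and the squared prediction error, SPE). The function names, array sizes, and the randomly generated "in-silico" batches are hypothetical placeholders; in the methodology itself the in-silico trajectories come from Gaussian process state-space models, which are not reproduced here.

```python
import numpy as np


def unfold_batchwise(batches):
    """Unfold an (I, K, J) array (batches, time points, variables) into (I, K*J)."""
    batches = np.asarray(batches, dtype=float)
    return batches.reshape(batches.shape[0], -1)


def fit_pca_monitor(X, n_components):
    """Fit a PCA model on autoscaled data using the singular value decomposition."""
    mean = X.mean(axis=0)
    std = X.std(axis=0, ddof=1)
    std[std == 0] = 1.0  # guard against constant columns
    Z = (X - mean) / std
    _, s, Vt = np.linalg.svd(Z, full_matrices=False)
    loadings = Vt[:n_components].T
    # Variance of each retained score over the calibration batches
    score_var = s[:n_components] ** 2 / (X.shape[0] - 1)
    return {"mean": mean, "std": std, "P": loadings, "score_var": score_var}


def monitoring_statistics(model, X):
    """Return Hotelling's T^2 and SPE for each row (batch) of X."""
    Z = (X - model["mean"]) / model["std"]
    scores = Z @ model["P"]
    t2 = np.sum(scores ** 2 / model["score_var"], axis=1)
    residuals = Z - scores @ model["P"].T
    spe = np.sum(residuals ** 2, axis=1)
    return t2, spe


# Hypothetical data: 5 "real" batches and 50 placeholder "in-silico" batches,
# each with 20 time points and 3 variables (random numbers, for shape only).
rng = np.random.default_rng(0)
real_batches = rng.normal(size=(5, 20, 3))
insilico_batches = rng.normal(size=(50, 20, 3))

# Pool real and in-silico batches, then calibrate the monitoring model.
pooled = unfold_batchwise(np.concatenate([real_batches, insilico_batches]))
model = fit_pca_monitor(pooled, n_components=2)
t2, spe = monitoring_statistics(model, pooled)

# Empirical 95% limits taken from the calibration pool (an assumption made
# here for simplicity; distribution-based limits are a common alternative).
t2_limit = np.percentile(t2, 95)
spe_limit = np.percentile(spe, 95)
```

A new batch would be unfolded in the same way and passed to `monitoring_statistics`; values above `t2_limit` or `spe_limit` would flag a deviation from the modeled normal operating conditions.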