ABSTRACT
Deploying process analytical technology tools such as Raman or IR spectroscopy, together with their associated multivariate calibration models, for process monitoring and control plays an important role in process automation and advanced manufacturing of pharmaceuticals. Preprocessing of the spectroscopic data is an important step in developing a multivariate calibration model. Several preprocessing methods are available, and each may influence calibration model performance differently. Here we investigated the influence of preprocessing procedures on the development and performance of chemometric models that predict the glucose concentration in a bioreactor. A Box–Behnken design of experiments (DOE) was used to generate the Raman spectroscopy data. Four factors were considered critical in the DOE: glucose, glutamine, glutamic acid, and antifoam concentrations. Raman spectroscopy data were collected both with and without aeration, independently for three cell culture media. For each medium, the data consisted of a calibration set (27 conditions) and a separate model validation set (9 conditions). Additionally, Raman data were collected for certain DOE runs at increasing cell densities, ranging from 0.5 × 10^6 cells/mL to 30 × 10^6 cells/mL, under aerated conditions. Data from the three cell culture media were used separately to develop calibration models with four different preprocessing procedures: baseline correction (BLC), Savitzky–Golay smoothing (SGS), Savitzky–Golay derivative (SGD), and orthogonal signal correction (OSC). The preprocessing procedures were applied individually and in combination to evaluate the calibration model parameters and performance metrics. We further developed glucose calibration models based on partial least squares (PLS) regression with 1–3 principal components.
The models developed with the OSC procedure gave superior performance metrics with just one principal component across all three media. Models developed with the other preprocessing procedures required two or more principal components to give comparable performance. Overall, the choice of preprocessing procedure affected model performance.
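To illustrate why a derivative-type preprocessing step can change what a calibration model has to fit, the following is a minimal sketch of Savitzky–Golay smoothing and first-derivative filtering applied to a synthetic signal. It is not the study's implementation: the 5-point window, the quadratic polynomial order, the helper names (apply_filter, sg_smooth, sg_derivative), and the synthetic "peak" are all assumptions chosen for illustration. In practice a library routine such as scipy.signal.savgol_filter would normally be used instead of hand-written weights.

```python
# Illustrative sketch only: 5-point, quadratic Savitzky-Golay filters applied to a
# synthetic signal. Window size and polynomial order are assumptions, not the
# settings used in the study.

# Classic 5-point quadratic Savitzky-Golay convolution weights (unit point spacing).
SG_SMOOTH = [-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35]
SG_DERIV1 = [-2 / 10, -1 / 10, 0.0, 1 / 10, 2 / 10]


def apply_filter(signal, weights):
    """Apply a centered window filter; return only the fully covered interior points."""
    half = len(weights) // 2
    return [
        sum(w * signal[i + k - half] for k, w in enumerate(weights))
        for i in range(half, len(signal) - half)
    ]


def sg_smooth(signal):
    """Savitzky-Golay smoothing (hypothetical helper, 5-point quadratic)."""
    return apply_filter(signal, SG_SMOOTH)


def sg_derivative(signal):
    """Savitzky-Golay first derivative (hypothetical helper, 5-point quadratic)."""
    return apply_filter(signal, SG_DERIV1)


# Synthetic example: a parabolic "peak", with and without a constant baseline offset.
x = list(range(11))
peak = [-(xi - 5) ** 2 + 25.0 for xi in x]
offset_peak = [p + 100.0 for p in peak]

smoothed = sg_smooth(peak)            # a quadratic signal passes through unchanged
d_plain = sg_derivative(peak)         # slope of the peak at each interior point
d_offset = sg_derivative(offset_peak) # same slope: the constant offset drops out
```

In this sketch the derivative of the offset signal matches the derivative of the original, which shows how a derivative filter removes a constant baseline shift before regression. Whether this helps or hurts a given PLS model depends on the data, which is the kind of question the study examines.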