The demand for rapid column screening, computer-assisted method development and transfer, and unambiguous compound identification by LC/MS analyses has pushed analysts to adopt experimental protocols and software for the accurate prediction of retention times in liquid chromatography (LC). This Perspective discusses the classical approaches used to predict retention times in LC over the last three decades and proposes future requirements to increase their accuracy. First, inverse methods for retention prediction are applied mainly during screening and gradient method optimization: a minimum number of experiments, or a design of experiments (DoE), is run to train and calibrate a model (either purely statistical or based on the principles and fundamentals of liquid chromatography) by a simple fitting process. These methods do not require accurate knowledge of the true column hold-up volume V0, the system dwell volume Vdwell (in gradient elution), or the retention behavior of the analytes (k versus the content of strong solvent φ, temperature T, pH, and ionic strength I). Their relative prediction accuracy is often excellent, with errors below a few percent. Statistical methods are expected to be the most attractive for handling very complex retention behavior, such as that encountered in mixed-mode chromatography (MMC). Fundamentally correct retention models accounting for the simultaneous impact of φ, I, pH, and T in MMC are still needed for method development based on chromatographic principles. Second, direct methods for retention prediction are ideally suited for accurate method transfer from one column/system configuration to another: these quality-by-design (QbD) methods are based on the fundamentals and principles of solid-liquid adsorption and gradient chromatography. No model calibration is necessary; however, these methods require universal conventions for the accurate determination of true retention factors (for 1 < k < 30) as a function of the experimental variables (φ, T, pH, and I) and of the true column/system parameters (V0, Vdwell, and the dispersion volume σ and relaxation volume τ of the programmed gradient profile at the column inlet, as well as the gradient distortion at the column outlet). Finally, when the molecular structure of the analytes is either known or assumed, retention prediction has mostly relied on statistical approaches such as linear solvation energy relationships (LSERs) and quantitative structure-retention relationships (QSRRs): their prediction accuracy remains limited to within 10-30%. They have been combined with molecular similarity approaches (in which the retention model is calibrated with compounds structurally similar to the targeted analytes) and artificial intelligence algorithms to improve their accuracy to below 10%. In this Perspective, it is proposed to adopt a more rigorous and fundamental approach by considering the details of the solid-liquid adsorption process: Monte Carlo (MC) or molecular dynamics (MD) simulations are promising tools to explain and interpret retention data that are too complex to be described by either empirical or statistical retention models.
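As a concrete illustration of the inverse (calibration-based) strategy described above, the sketch below fits a simple retention model to a few isocratic scouting runs and uses it to predict k at a new eluent composition. It is a minimal sketch only: the linear solvent strength (LSS) relation ln k = ln k_w − Sφ is used as one common choice of retention model, and the data values, names, and Python/NumPy implementation are illustrative assumptions rather than anything prescribed by the Perspective.

```python
# Minimal sketch of inverse retention-model calibration (illustrative only).
# Assumptions: the linear solvent strength (LSS) model ln k = ln k_w - S*phi holds
# over the calibrated range; the phi/k values below are hypothetical scouting data.
import numpy as np

phi = np.array([0.30, 0.40, 0.50])   # fraction of strong solvent in the scouting runs
k = np.array([12.4, 4.1, 1.4])       # measured retention factors, k = (t_R - t_0) / t_0

# Calibrate the two LSS parameters by linear least squares on ln k versus phi
slope, ln_kw = np.polyfit(phi, np.log(k), 1)
S = -slope                            # solvent strength parameter (positive by convention)

def predict_k(phi_new: float) -> float:
    """Predict the retention factor at a new mobile-phase composition."""
    return float(np.exp(ln_kw - S * phi_new))

print(f"S = {S:.1f}, ln k_w = {ln_kw:.2f}, predicted k(phi=0.45) = {predict_k(0.45):.2f}")
```

In gradient elution, a calibrated k(φ) of this kind is combined with the column hold-up volume V0 and the system dwell volume Vdwell to compute retention times; the direct methods discussed above instead require these quantities to be the true, accurately measured values rather than fitted ones.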