Automated treatment planning strategies are increasingly used in clinical routine to reduce inter-planner variability, speed up the optimization process, and improve plan quality. This study evaluates the feasibility and quality of intensity-modulated proton therapy (IMPT) plans generated with four different knowledge-based planning (KBP) pipelines fully integrated into a commercial treatment planning system (TPS). A data set of 60 oropharyngeal cancer patients was split into 11 folds, each containing 47 patients for training, 5 patients for validation, and 5 patients for testing. A dose prediction model was trained on each fold, resulting in a total of 11 models. Three patients were held out to assess whether the differences between models were significant. Starting from voxel-based dose predictions, we analyzed the two steps that follow the dose prediction: post-processing of the predicted dose and dose mimicking (DM). We focused on the effect of post-processing (PP) versus no post-processing (NPP), combined with two different DM algorithms for optimization: the one available in the commercial TPS RayStation (RSM) and a simpler isodose-based mimicking (IBM). Using 55 test patients (5 test patients for each model), we evaluated the quality and robustness of the plans generated by the four proposed KBP pipelines (PP-RSM, PP-IBM, NPP-RSM, NPP-IBM). After robust evaluation, dose-volume histogram (DVH) metrics in the nominal and worst-case scenarios were compared to those of the manually generated plans. Nominal doses from the four KBP pipelines showed promising results, achieving comparable target coverage and lower dose to organs at risk (OARs) than the manual plans. However, overly optimistic post-processing of the dose prediction (i.e., a substantial decrease of the dose to the organs) compromised the robustness of the plans.
Even though RSM seemed to partially compensate for the lack of robustness in the PP plans, 65% of the patients still did not reach the expected robustness levels. NPP-RSM plans seemed to achieve the best trade-off between robustness and OAR sparing. PP and DM strategies are crucial steps for generating acceptably robust and deliverable IMPT plans from doses predicted by machine learning (ML). Before the clinical implementation of any KBP pipeline, the PP and DM parameters predefined by the commercial TPS need to be tuned using a comprehensive feedback loop in which the robustness of the final dose calculations is evaluated. With the right choice of PP and DM parameters, KBP strategies have the potential to generate IMPT plans within clinically acceptable levels, comparable to plans manually generated by dosimetrists.
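The data split described above can be checked with a short sketch. The abstract does not state how patients were assigned to folds, so the scheme below is only one assumption that reproduces the reported counts (3 held-out patients, 11 folds of 47/5/5, and 55 distinct test patients); the function name `make_folds`, the wrap-around choice of validation patients, and the random seed are all hypothetical.

```python
import random


def make_folds(patient_ids, n_folds=11, n_holdout=3, n_test=5, n_val=5, seed=0):
    """Hypothetical split: hold out a few patients, then build folds whose
    test sets are disjoint blocks of the remaining pool."""
    ids = list(patient_ids)
    random.Random(seed).shuffle(ids)  # reproducible shuffle
    holdout, pool = ids[:n_holdout], ids[n_holdout:]

    folds = []
    for k in range(n_folds):
        start = k * n_test
        # Disjoint test block for fold k.
        test = pool[start:start + n_test]
        # Assumption: validation patients are the next block, wrapping around.
        val = [pool[(start + n_test + i) % len(pool)] for i in range(n_val)]
        excluded = set(test) | set(val)
        # Everything else in the pool is used for training.
        train = [p for p in pool if p not in excluded]
        folds.append({"train": train, "val": val, "test": test})
    return holdout, folds


if __name__ == "__main__":
    holdout, folds = make_folds(range(60))
    n_unique_test = len({p for f in folds for p in f["test"]})
    print(len(holdout), len(folds), n_unique_test)
```

With 60 patient identifiers, each fold has 47 training, 5 validation, and 5 test patients, and the union of the test sets covers 55 of the 57 pooled patients, matching the counts given in the text.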