Introduction
With the growing use of complex treatment techniques such as VMAT, the physics workload for pre-treatment patient-specific QA has increased. In this work, we evaluated the ability of eight RTPLAN-based metrics to predict robust treatment delivery.

Methods
Ninety-three VMAT plans were delivered on the ArcCHECK system and evaluated using global gamma index criteria of 2 %/2 mm (90 % passing points) and 3 %/3 mm (95 %). Using each patient’s DICOM RTPLAN and Python scripting, we calculated the following aperture-based metrics from Crowe et al. [1]: MFA (Mean Field Area), MAD (Mean Aperture Displacement), SAS (Small Aperture Score, threshold 10 mm), CLS (Closed Leaf Score) and CAS (Cross Axis Score), together with the MCS (Modulation Complexity Score) deliverability metric from McNiven et al. [2]. In addition, we developed a novel metric based on a second TPS dose calculation with a different leaf offset model: LOIC(PTV) (Leaf Offset Impact on Calculation) is defined as the percentage variation of the PTV mean dose when the leaf offset parameter of the model is changed from 0.5 to 0 mm, and aims to quantify the “MU-weighted global narrowness” of the MLC apertures. Finally, the total MU per Gy delivered was tested. Correlation between gamma passing rates (GPR) and metric values was assessed using Pearson’s r coefficient. Receiver operating characteristic (ROC) analysis was performed to determine appropriate complexity threshold values above which a plan should be considered for re-optimization (high specificity) or below which it could be exempted from QA measurements (100 % sensitivity).

Results
Out of 93 plans, 77 and 41 passed the 3 %/3 mm and 2 %/2 mm gamma criteria, respectively. Table 1 shows the absolute Pearson’s r coefficients, associated p-values and ROC area under the curve (AUC) for the eight metrics and two gamma criteria. A strong correlation (p < 0.001) was observed between GPR and LOIC, CAS, MCS, SAS and MU/Gy.
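The correlation and ROC analysis described in the Methods can be illustrated with a minimal sketch. It is not the authors' script: the function names and the sample values are hypothetical, the 95 % action level for the 3 %/3 mm criterion is taken from the Methods, and the AUC is computed through the rank (Mann-Whitney) formulation, assuming that a higher metric value indicates a plan more likely to fail.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def roc_auc(scores, failed):
    """AUC as the probability that a failing plan has a higher metric value
    than a passing plan (ties count as 0.5)."""
    pos = [s for s, f in zip(scores, failed) if f]
    neg = [s for s, f in zip(scores, failed) if not f]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical illustrative values, NOT data from the study.
loic = [0.8, 1.0, 1.2, 1.5, 1.8, 2.1]        # metric value per plan (%)
gpr = [99.0, 98.5, 97.0, 96.0, 93.0, 91.0]   # gamma passing rate per plan (%)
failed = [g < 95.0 for g in gpr]             # plan fails if GPR is below 95 %

r = pearson_r(loic, gpr)
auc = roc_auc(loic, failed)
print(f"|r| = {abs(r):.2f}, AUC = {auc:.2f}")
```

The absolute value of r is reported because a complexity metric is expected to correlate negatively with GPR. In practice, library routines such as `scipy.stats.pearsonr` would also return the associated p-value.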
The highest Pearson’s r value was obtained for LOIC (0.66 and 0.69). ROC curves showed the best results for LOIC versus GPR at 3 %/3 mm, with an AUC of 0.92 (Fig. 1). A LOIC threshold of 1.7 % discriminated robust from non-robust delivery with a false positive rate of 6.5 % and a true positive rate of 69 %, which makes re-planning a relevant option for plans above this value. Conversely, a LOIC threshold of 1.25 % produced no false negatives (full sensitivity), allowing a workload reduction of 49 %.

Conclusions
Of the eight metrics evaluated, LOIC was the most powerful for identifying sparsely or overly modulated plans before time-consuming QA measurements are performed, making it possible to roughly halve the patient QA workload and to improve plan accuracy and deliverability.
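The two threshold choices reported in the Results (a high-specificity threshold for re-planning and a full-sensitivity threshold for QA exemption) can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the function names and sample values are hypothetical, a plan is assumed to be flagged when its metric value is at or above the threshold, and the workload reduction is taken as the fraction of plans falling below the full-sensitivity threshold.

```python
def confusion_rates(scores, failed, threshold):
    """Flag plans with score >= threshold.
    Returns (sensitivity, false_positive_rate)."""
    tp = sum(1 for s, f in zip(scores, failed) if f and s >= threshold)
    fn = sum(1 for s, f in zip(scores, failed) if f and s < threshold)
    fp = sum(1 for s, f in zip(scores, failed) if not f and s >= threshold)
    tn = sum(1 for s, f in zip(scores, failed) if not f and s < threshold)
    return tp / (tp + fn), fp / (fp + tn)

def full_sensitivity_threshold(scores, failed):
    """Largest threshold that still flags every failing plan,
    i.e. the lowest metric value among failing plans."""
    return min(s for s, f in zip(scores, failed) if f)

def workload_reduction(scores, threshold):
    """Fraction of plans below the threshold, which could skip measurement."""
    return sum(1 for s in scores if s < threshold) / len(scores)

# Hypothetical illustrative values, NOT data from the study.
scores = [0.8, 1.0, 1.3, 1.4, 1.6, 1.9, 2.2, 1.1]   # metric value per plan (%)
failed = [False, False, True, False, False, True, True, False]

t_exempt = full_sensitivity_threshold(scores, failed)
sens, fpr = confusion_rates(scores, failed, t_exempt)
saved = workload_reduction(scores, t_exempt)
print(f"threshold = {t_exempt}, sensitivity = {sens:.2f}, "
      f"FPR = {fpr:.2f}, workload reduction = {saved:.1%}")
```

Raising the threshold trades sensitivity for specificity: calling `confusion_rates` with a higher value flags fewer passing plans but starts to miss failing ones, which is the trade-off behind the two thresholds quoted above.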