Excessive modulation of radiotherapy (RT) treatment plans increases plan complexity. Evaluating the multidimensional relationship between plan complexity metrics, calculation-based patient-specific quality assurance (PSQA), and conventional measurement-based PSQA could help enhance the robustness of treatment planning, guide the allocation of clinical QA resources, and ultimately lessen the QA workload. Fifty-five metrics affecting RT planning and delivery accuracy were calculated with an in-house program to describe the complexity of 404 dynamic IMRT plans; the metrics are sensitive to small fields, aperture position, MLC edges, low MUs, MLC leaf motion, leaf speed/acceleration, etc. Calculation-based PSQA was performed using the Monte Carlo (MC) method and the Collapsed Cone Convolution (CCC) algorithm, implemented in SciMoCa and Mobius 3D, respectively. Measurement-based PSQA was performed using 3D diode arrays with "O", "+" and "×" shaped geometries, as found in ArcCheck, Delta4 phantom+ (Delta4) and Delta4PT phantom (Delta4PT), respectively. Gamma passing rates (GPRs) were recorded as the result of each QA system. The multidimensional relationship was evaluated using correlation analysis and principal component linear regression (PCR) analysis. A total of 4448 GPRs were collected from the various QA systems on two Linacs. The modulation index for speed (MIs) and the modulation index for acceleration (MIa) were consistently located at the high points of the radar plots of the absolute Spearman correlation coefficient |rs| between metrics and GPRs for four of the QA systems, the exception being Delta4. In addition, the rs between SciMoCa and ArcCheck was 0.275-0.531 (P ≤ 0.001), between SciMoCa and Delta4 was 0.32-0.418 (P ≤ 0.001), and between Mobius 3D and Delta4PT was 0.124-0.226 (P ≤ 0.05).
The coefficients of determination (R2) of the PCR models were 0.461-0.756 (P ≤ 0.001) for SciMoCa, 0.243-0.440 (P ≤ 0.001) for ArcCheck, 0.268-0.402 (P ≤ 0.001) for Delta4, 0.299-0.407 (P ≤ 0.001) for Mobius 3D, and 0.087-0.141 (P ≤ 0.05) for Delta4PT. This study is the first comprehensive assessment of the impact of various complexity metrics on the accuracy of TPS calculation and Linac delivery. Of the metrics studied, MIs and MIa have the most pronounced impact on TPS calculation and delivery accuracy, so extra attention should be paid to them during the planning process. Because calculation-based and measurement-based QA results correlate poorly, it is inappropriate to use calculation-based QA to predict the results of measurement-based QA. Furthermore, calculation-based QA outperforms measurement-based QA in identifying highly complex plans, which can further guide the optimization of the clinical QA process and save limited clinical resources.
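
As an illustration of the correlation analysis described above, the following is a minimal sketch of the Spearman rank correlation coefficient rs between one complexity metric and the GPRs of one QA system. It is not the in-house program used in the study: the function names (`average_ranks`, `spearman_rs`) and the sample values for MIs and GPR are hypothetical and chosen only for demonstration, and only the Python standard library is assumed.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks of `values`, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rs(x, y):
    """Spearman rs, computed as the Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("rs is undefined when one input is constant")
    return cov / sqrt(var_x * var_y)


# Hypothetical values for five plans (not data from the study):
# a complexity metric (e.g. MIs) and the corresponding GPRs in percent.
mi_s = [10.2, 14.8, 12.1, 18.5, 16.0]
gpr = [99.1, 97.0, 96.8, 93.5, 96.4]

rs = spearman_rs(mi_s, gpr)
print(f"rs = {rs:.3f}, |rs| = {abs(rs):.3f}")
```

A negative rs in such a sketch would indicate that GPR tends to fall as the metric rises; the radar plots mentioned above use the absolute value |rs|, so only the strength of the monotonic association is compared across metrics.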