The Joint AAPM-ESTRO TG-360 is developing a quantitative framework to evaluate treatment verification systems used for patient-specific quality assurance (PSQA). A subgroup was commissioned to determine which potential failure modes pose the greatest risk to treatment quality and safety and should therefore be evaluated as part of PSQA verification. Its aims were to create an extensive database of potential radiotherapy failure modes that should be detected by PSQA and to determine their relative importance for maximizing treatment quality. The subgroup consisted of eight physicists from seven countries, including representatives from three international quality assurance groups. We collected error reports from RO-ILS, SAFRON, AAPM TG publications, and other literature, including international audits. We focused on the subset of failure modes that affect whether the planned dose matches the dose received by the patient. We performed a failure-mode-and-effects analysis (FMEA), estimating the severity (S), occurrence (O), and detectability (D) of each failure mode. Detectability was scored assuming that PSQA was not performed but other routine clinical QA was, which showed how important PSQA is for detecting each specific failure mode. We analyzed the risk priority number (RPN = O*S*D), O*S, and severity rankings to determine the priority of each failure mode. We collected 394 error reports, which we categorized into 33 failure modes that underwent FMEA. Five failure modes were in the top ranks of both the RPN and O*S analyses: four involving treatment planning system (TPS) commissioning and one regarding patient model errors. The failure modes with the highest RPN were TPS algorithm limitations, TPS commissioning errors [multileaf collimator (MLC) modeling, output factor, percent-depth-dose/tissue-maximum-ratio (PDD/TMR), off-axis factor], and patient weight variation.
The failure modes with the highest O*S were similar, with the addition of external patient position variation and incorrect linear accelerator isocenter and cGy/monitor unit calibration. The RPN and O*S analyses prioritized failure modes that impacted multiple patients and had high occurrence and detectability scores, whereas the severity analysis gave higher priority to single-patient modes with high severity scores. The highest-ranking modes by severity were MLC sequence deletion, collision, and incorrect TPS isocenter. We have developed a list of failure modes that are critical to detect during PSQA and ranked them in order of importance. The top failure modes emphasize the importance of using a variety of treatment verification systems for PSQA, from secondary dose calculation through in-vivo dosimetry, to detect all possible errors. For failure modes in the top quartile, PSQA is critical: without adequate PSQA, these errors may go undetected unless caught by an external audit. This analysis can be useful for optimizing PSQA workflows and for designing evaluations of treatment verification systems, and it will be used by the Joint AAPM-ESTRO TG-360 to determine an appropriate validation strategy.
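The three rankings described above (RPN = O*S*D, O*S, and severity alone) can be illustrated with a minimal Python sketch. It is not the subgroup's actual tooling. The `FailureMode` class, the `rank` helper, the 1-10 scoring scale, and every score below are hypothetical and chosen only for illustration. The failure mode names are taken from the text, but the numbers are not results from this work. The sketch also assumes the common FMEA convention that a higher D means the failure is harder to detect.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FailureMode:
    """One failure mode with its FMEA scores (hypothetical 1-10 scale)."""
    name: str
    occurrence: int     # O: how often the failure is expected to occur
    severity: int       # S: how harmful the failure is if it reaches the patient
    detectability: int  # D: assumed convention, higher = harder to detect without PSQA

    @property
    def rpn(self) -> int:
        # Risk priority number: RPN = O * S * D
        return self.occurrence * self.severity * self.detectability

    @property
    def os_score(self) -> int:
        # Occurrence-severity product, which ignores detectability
        return self.occurrence * self.severity


def rank(modes, key):
    """Return failure mode names sorted from highest to lowest priority."""
    return [m.name for m in sorted(modes, key=key, reverse=True)]


# Illustrative scores only; these are NOT values reported by the subgroup.
MODES = [
    FailureMode("TPS commissioning error (MLC modeling)", 6, 7, 8),
    FailureMode("Patient weight variation", 8, 4, 6),
    FailureMode("MLC sequence deletion", 1, 10, 5),
]

by_rpn = rank(MODES, key=lambda m: m.rpn)
by_os = rank(MODES, key=lambda m: m.os_score)
by_severity = rank(MODES, key=lambda m: m.severity)

print("RPN ranking:     ", by_rpn)
print("O*S ranking:     ", by_os)
print("Severity ranking:", by_severity)
```

With these made-up scores, the rare but severe single-patient mode ranks last by RPN and O*S and first by severity, which is the kind of contrast between the analyses that the text describes.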