Many early warning scores (EWSs) exist, yet it remains unclear which is the most clinically valid for prehospital respiratory assessment, that is, for estimating short-term and midterm mortality, intensive-care unit admission, and the need for airway management in life-threatening acute respiratory distress. This is a prospective, observational, multicentre, ambulance-based, external validation study performed in 44 ambulance services and four hospitals across three Spanish provinces (ie, Salamanca, Segovia, and Valladolid). We identified adults (ie, those aged 18 years and older) discharged to the emergency department with suspected acute respiratory distress. The primary outcome was 2-day all-cause in-hospital mortality, assessed for all patients and by prehospital respiratory condition: dyspnoea, chronic obstructive pulmonary disease (COPD), COVID-19, other infections, and other conditions (asthma exacerbation, haemoptysis, and bronchoaspiration). Secondary outcomes were 30-day mortality, intensive-care unit admission, and invasive and non-invasive mechanical ventilation. Eight EWSs were evaluated: the National Early Warning Score 2, the Modified Rapid Emergency Medicine Score, the Rapid Acute Physiology Score, the Quick Sequential Organ Failure Assessment Score, the CURB-65 Severity Score for Community-Acquired Pneumonia, the BAP-65 Score for Acute Exacerbation of COPD, the Quick COVID-19 Severity Index, and the Modified Sequential Organ Failure Assessment (mSOFA). Their predictive validity was assessed through calibration, clinical net benefit (determined by decision curve analysis), and discrimination (area under the receiver operating characteristic curve [AUROC], compared with DeLong's test). Between Jan 1, 2020, and Nov 31, 2022, 902 patients were enrolled.
Overall 2-day mortality was 87 (10%) of 902 patients; of these deaths, 35 (40%) occurred in patients with dyspnoea, nine (10%) with COPD, 13 (15%) with COVID-19, 28 (32%) with other infections, and two (2%) with other conditions. mSOFA showed the best calibration, the highest net benefit, and the best discrimination (AUROC 0·911, 95% CI 0·86–0·95) for predicting 2-day mortality, and its discrimination was significantly better than that of the other scores (p<0·0001). mSOFA also outperformed the other scores in predicting 2-day mortality within each prehospital respiratory condition, and in predicting the secondary outcomes, except for non-invasive mechanical ventilation. Our results showed that mSOFA outperformed the other EWSs. Including mSOFA in prehospital decision making will enable rapid identification of patients in acute respiratory distress who are at high risk of deterioration, allowing prioritisation of resources and patient care. This study was funded by the Gerencia Regional de Salud, Public Health System of Castilla y León (GRS Spain). For the Spanish translation of the abstract, see the Supplementary Materials section.