Abstract

Background
Although peripheral artery disease (PAD) is common and associated with a high risk of mortality, contemporary PAD research remains limited by the poor accuracy of diagnosis codes in the electronic health record (EHR). We previously developed and validated a novel natural language processing (NLP) algorithm that extracts ankle-brachial index (ABI) and toe-brachial index (TBI) values from clinical documents and identifies patients with PAD with high accuracy (positive predictive value 92.4%).

Purpose
To scale the NLP system to develop a nationwide registry of PAD in the Veterans Health Administration (VHA), the largest integrated health system in the U.S., which provides care to ~9 million Veterans across 130 sites; to describe the baseline characteristics and mortality outcomes of registry participants; and to identify predictors of long-term mortality.

Methods
A total of 1,182,025 documents from vascular and radiology procedures performed during 2013-2020 were processed by the NLP system. After excluding non-ABI vascular studies, patients with a known diagnosis of PAD prior to 2015, patients younger than 40 years of age, and those who did not regularly follow up in the VHA, we identified 107,711 patients with newly diagnosed PAD during 2015-2020. Patients were followed longitudinally using the VA’s electronic health record data. The primary outcome was all-cause mortality. We used Cox proportional hazards regression to examine predictors of long-term mortality.

Results
The mean age was 70.5 years; 97.3% of patients were male, and 18.5% were of Black race. The mean ABI value was 0.78 (SD: 0.26) and the mean TBI value was 0.51 (SD: 0.19). There was a high prevalence of hypertension (83.6%), heart failure (20.8%), diabetes (53.6%), renal failure (22.1%) and chronic obstructive pulmonary disease (32.4%) (Table 1). At 1 year, 9.6% of patients had died, and this proportion increased to 25.4% at 3 years.
Although nearly all patient variables were significantly associated with mortality, older age, severe PAD at diagnosis, diabetes, heart failure, renal failure and chronic obstructive pulmonary disease were the strongest predictors of long-term mortality. Black race was associated with a lower hazard of long-term mortality.

Conclusions
Using a previously validated novel NLP algorithm that extracts ABI and TBI values from clinical documents, we have successfully developed a registry of newly diagnosed PAD in the VHA. Ongoing studies from our cohort will yield important insights into the association between clinical management of PAD and clinical outcomes, with the goal of identifying opportunities for improving care.
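
The exclusion steps listed in Methods can be illustrated with a minimal Python sketch. It is not the registry's actual pipeline. The `Candidate` record, its field names, and the `eligible` and `age_on` helpers are hypothetical, as are the choices to measure age at the index study date and to represent regular VHA follow-up as a precomputed boolean, since the abstract does not define either.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Study window taken from the abstract (newly diagnosed PAD during 2015-2020).
STUDY_START = date(2015, 1, 1)
STUDY_END = date(2020, 12, 31)
MIN_AGE = 40  # patients younger than 40 years are excluded


@dataclass
class Candidate:
    """Hypothetical record for one patient flagged by the NLP system."""
    patient_id: str
    birth_date: date
    index_date: date                    # date of the qualifying study (assumed)
    prior_pad_dx_date: Optional[date]   # earliest known PAD diagnosis, if any
    regular_vha_followup: bool          # definition not given in the abstract
    is_abi_study: bool                  # False for non-ABI vascular studies


def age_on(birth: date, on: date) -> int:
    """Age in completed years on a given date."""
    return on.year - birth.year - ((on.month, on.day) < (birth.month, birth.day))


def eligible(c: Candidate) -> bool:
    """Apply the exclusion criteria in the order the abstract lists them."""
    if not c.is_abi_study:
        return False
    if not (STUDY_START <= c.index_date <= STUDY_END):
        return False
    if c.prior_pad_dx_date is not None and c.prior_pad_dx_date < STUDY_START:
        return False
    if age_on(c.birth_date, c.index_date) < MIN_AGE:
        return False
    return c.regular_vha_followup


if __name__ == "__main__":
    candidates = [
        Candidate("A", date(1950, 6, 1), date(2016, 3, 1), None, True, True),
        Candidate("B", date(1950, 6, 1), date(2016, 3, 1), date(2014, 5, 1), True, True),
        Candidate("C", date(1980, 6, 1), date(2016, 3, 1), None, True, True),
    ]
    cohort = [c.patient_id for c in candidates if eligible(c)]
    print(cohort)
```

Keeping each criterion as a separate early return makes it straightforward to count how many candidates are removed at each step, which is how cohort flow is usually reported.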