Abstract Background: Previous lung cancer risk models were mostly based on case-control studies and used a limited set of epidemiological risk factors. Their clinical utility was hampered by a lack of added value beyond smoking history. In this study, we developed a series of lung cancer risk prediction models using extensive data, including lab testing data, from a large prospective cohort in Taiwan. In particular, we developed a risk prediction model for heavy smokers with ≥30 pack-years that could further stratify this group and improve the screening efficiency of low-dose computed tomography (LDCT). Methods: Cohort participants were divided equally into training and validation sets. Models were fitted with Cox proportional hazards regression using variables from personal history and standard lab tests, and their goodness of fit and area under the curve (AUC) were evaluated. Results: A total of 1,774 incident lung cancer cases were identified among 505,030 participants recruited over the 14-year period 1994-2007. We developed a “history only” model that included smoking status, body mass index (BMI), family history, and personal history; a “lab only” model that included blood in sputum, lung function test, and serum biomarkers such as carcinoembryonic antigen (CEA), bilirubin, and alpha-fetoprotein (AFP); and an integrative model with all of these variables. The AUC was 0.837 for the “history only” model, 0.839 for the “lab only” model, and 0.862 for the integrative model. We also developed integrative models for ever-smokers and never-smokers, with AUCs of 0.872 and 0.801, respectively. Furthermore, since LDCT is recommended for heavy smokers with ≥30 pack-years but has a high false positive rate, we developed a risk model to further stratify this group. A total of 454 incident lung cancer cases were identified among 23,582 heavy smokers. 
Six variables were significantly associated with lung cancer risk: age, smoking status (current or former), smoking intensity (number of packs smoked per day), BMI, lung function test, and CEA. Current smokers who smoked >1 pack per day and had BMI <25, the lowest lung function, and the highest CEA had the highest risk. The number of smokers needed to screen to identify one lung cancer (NNS) could decrease from 52 among heavy smokers as a whole to 13 in this high-risk group, a four-fold improvement in case identification. The highest risks were found among smokers with CEA ≥3.8 µg/L (NNS between 13 and 37) and smokers with lung function MMF <43 ml/sec (NNS between 13 and 25). Conclusion: Routinely collected lab data can significantly improve lung cancer risk prediction models. An integrative lung cancer risk prediction model for heavy smokers with ≥30 pack-years can stratify subjects with a 3- to 4-fold difference in lung cancer risk for LDCT screening. The model showed that decreased lung function and high CEA values, along with detailed smoking history and low BMI, offered a unique predictive ability to improve the efficiency of LDCT screening. Citation Format: Xifeng Wu, Chi Pang Wen, Yuanqing Ye, MinKwang Tsai, Xia Pu, Wong-Ho Chow, Chad Huff, Sonia Cunningham, Maosheng Huang, Shiuan Bei Wu, Chwen Keng Tsao, Jian Gu. The use of lung cancer risk prediction model to stratify heavy smokers: The power of routinely collected lab data. [abstract]. In: Proceedings of the 105th Annual Meeting of the American Association for Cancer Research; 2014 Apr 5-9; San Diego, CA. Philadelphia (PA): AACR; Cancer Res 2014;74(19 Suppl):Abstract nr LB-298. doi:10.1158/1538-7445.AM2014-LB-298
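Illustrative note on the NNS figures: the reported NNS of 52 for heavy smokers as a whole is consistent with a crude ratio of participants to incident cases (23,582 / 454). The short Python sketch below reproduces that arithmetic. It assumes NNS is computed as the reciprocal of crude cumulative incidence, which the abstract does not state explicitly; the function name `number_needed_to_screen` is hypothetical and not taken from the study.

```python
def number_needed_to_screen(cases: int, participants: int) -> float:
    """Crude NNS: participants screened per incident case identified.

    Assumption: NNS is the reciprocal of crude cumulative incidence,
    ignoring follow-up time and censoring.
    """
    if cases <= 0 or participants <= 0:
        raise ValueError("cases and participants must be positive")
    return participants / cases


# Figures reported in the abstract for heavy smokers (>=30 pack-years).
heavy_smokers_nns = number_needed_to_screen(cases=454, participants=23582)

# NNS reported in the abstract for the highest-risk subgroup.
high_risk_nns = 13

# Ratio of the rounded NNS values, i.e. the reported four-fold improvement.
fold_improvement = round(heavy_smokers_nns) / high_risk_nns

print(round(heavy_smokers_nns))  # 52
print(fold_improvement)          # 4.0
```

The subgroup value of 13 is taken directly from the abstract rather than recomputed, since the subgroup case and participant counts are not reported.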