Abstract

Background
Emerging studies have reported that approximately 30% of patients with stage I/II lung cancer die within 5 years because of progression and recurrence. Sensitive and specific non-invasive biomarkers are greatly needed for prognosis and survival prediction in early-stage lung cancer.

Methods
An extreme phenotype strategy was applied to rapidly identify biomarkers associated with early-stage lung cancer survival in the Boston Lung Cancer Study (BLCS) cohort. Multiple omics platforms, including SOMAscan and the Infinium MethylationEPIC Array, were used to identify biomarkers in circulating (blood or serum) and solid (tissue) samples at different molecular levels. Public databases (e.g., TCGA, HPA and GEO) were used for external validation. Differential analysis was performed by t-test on normalized data. Kaplan-Meier curves with the log-rank test were used to plot and compare survival between candidate groups.

Results
Among the stage I lung cancer patients registered in BLCS, we randomly selected 77 samples for omics profiling, of which 37 had long survival (mean = 157.36 months) and 40 had short survival (mean = 19.79 months). At the protein level, we found 120 circulating differentially expressed proteins (C-DEPs) between the two extreme groups (P < 0.05), six of which had a fold change (FC) > 2 [i.e., ApoB (APOB), CK-MB (CKB CKM) and CK-MM (CKM) were higher in the long-survival group; GAPDH (GAPDH), RAN (RAN) and SPHK1 (SPHK1) were higher in the short-survival group]. In lung cancer tissue, both ApoB (APOB) and CK-MM (CKM) were rarely expressed at the RNA and protein levels. Thus, we considered ApoB and CK-MM to be serum-specific proteins and the other four to be widespread biomarkers. At the DNA methylation level, 19,084 CpGs were identified as circulating differentially methylated positions (C-DMPs) and 80,342 as tumor DMPs (T-DMPs), each at P < 0.05.
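As an illustration of the two-step protein screen described above (a t-test at P < 0.05 followed by a fold-change filter at FC > 2), the following is a minimal sketch and not the authors' pipeline. It assumes a pooled-variance Student's t-test, positive normalized values, and a fold change defined as the ratio of group means; the abstract specifies none of these. The function names and the toy values are hypothetical.

```python
import math
from statistics import mean, variance


def t_two_sided_p(t, df, steps=2000):
    """Two-sided P value for a t statistic, by Simpson integration of the t density.

    Adequate for a sketch; very small P values lose precision to cancellation.
    """
    a = abs(t)
    if a == 0:
        return 1.0
    # Log of the normalising constant of the Student t density.
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = a / steps  # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(a)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    central = 2 * total * h / 3  # P(-|t| < T < |t|)
    return max(0.0, 1.0 - central)


def screen(long_vals, short_vals, alpha=0.05, fc_cut=2.0):
    """Flag one protein as a candidate if P < alpha and FC exceeds fc_cut in either direction.

    Assumptions (not from the abstract): equal-variance t-test, FC = mean(long) / mean(short).
    """
    n1, n2 = len(long_vals), len(short_vals)
    m1, m2 = mean(long_vals), mean(short_vals)
    pooled = ((n1 - 1) * variance(long_vals)
              + (n2 - 1) * variance(short_vals)) / (n1 + n2 - 2)
    t = (m1 - m2) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    p = t_two_sided_p(t, n1 + n2 - 2)
    fc = m1 / m2
    return {
        "p": p,
        "fc": fc,
        "higher_in": "long" if fc > 1 else "short",
        "selected": p < alpha and (fc > fc_cut or fc < 1 / fc_cut),
    }


if __name__ == "__main__":
    # Hypothetical normalized abundances for two made-up proteins.
    toy = {
        "protein_A": ([10.0, 12.0, 11.0, 13.0], [4.0, 5.0, 6.0, 5.0]),
        "protein_B": ([5.0, 6.0, 5.0, 4.0], [5.0, 5.0, 6.0, 4.0]),
    }
    for name, (long_vals, short_vals) in toy.items():
        print(name, screen(long_vals, short_vals))
```

In a real analysis the P values would more likely come from an established statistics library, and screening many proteins or CpGs at once would normally call for a multiple-testing correction, which the abstract does not mention and this sketch does not apply.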
Next, we integrated C-DEPs, C-DMPs and T-DMPs to interpret the origin and expression pattern of specific biomarkers. Intriguingly, several T-DMPs and C-DMPs were found, with high correlation, only in the genes encoding the serum-specific proteins (ApoB and CK-MM) and not in those encoding the widespread biomarkers. These findings indicate two separate origins: the serum-specific proteins bypassed RNA and protein products in the tumor and were released directly from the solid tissue into the circulatory system, whereas the widespread proteins left RNA and protein behind in situ. Furthermore, among the four widespread biomarkers, we observed that SPHK1 RNA was highly expressed in lung tumors and that this higher expression was associated with worse survival (log-rank P = 1.20 × 10⁻⁶), especially in stage I (log-rank P = 1.20 × 10⁻⁵). Similar results were found for the SPHK1 protein in tumors. These findings reveal that the prognostic effect of SPHK1 at both the RNA and protein levels was limited to patients in stage I and was not seen in advanced stages.

Conclusion
To our knowledge, this is the first study to use an extreme phenotype strategy to identify non-invasive prognostic biomarkers for early-stage lung cancer. By integrating multiple omics platforms, it is feasible to identify and evaluate circulating biomarkers for the prognostic assessment of lung cancer.

Citation Format: Mulong Du, Qianyu Yuan, Li Su, Feng Chen, David C. Christiani. Integration of multiple omics underlying extreme phenotype strategy identify non-invasive prognostic biomarkers specific to early-stage lung cancer [abstract]. In: Proceedings of the Annual Meeting of the American Association for Cancer Research 2020; 2020 Apr 27-28 and Jun 22-24. Philadelphia (PA): AACR; Cancer Res 2020;80(16 Suppl):Abstract nr 5784.