Identifying non-invasive blood-based biomarkers is crucial for early detection and monitoring of liver cancer (LC), thereby improving patient outcomes. This study leveraged computational approaches to predict potential blood-based biomarkers for LC. Machine learning (ML) models were developed using selected features from blood-secretory proteins collected from the curated databases. The logistic regression (LR) model demonstrated the optimal performance. Transcriptome analysis across 7 LC cohorts revealed 231 common differentially expressed genes (DEGs). The encoded proteins of these DEGs were compared with the ML dataset, revealing 29 proteins overlapping with the blood-secretory dataset. The LR model also predicted 29 additional proteins as blood-secretory with the remaining protein-coding genes. As a result, 58 potential blood-secretory proteins were obtained. Among the top 20 genes, 13 common hub genes were identified. Further, area under the receiver operating characteristic curve (ROC AUC) analysis was performed to assess the genes as potential diagnostic blood biomarkers. Six genes, ESM1, FCN2, MDK, GPC3, CTHRC1 and COL6A6, exhibited an AUC value higher than 0.85 and were predicted as blood-secretory. This study highlights the potential of an integrative computational approach for discovering non-invasive blood-based biomarkers in LC, facilitating for further validation and clinical translation. SignificanceLiver cancer is one of the leading causes of premature death worldwide, with its prevalence and mortality rates projected to increase. Although current diagnostic methods are highly sensitive, they are invasive and unsuitable for repeated testing. Blood biomarkers offer a promising non-invasive alternative, but their wide dynamic range of protein concentration poses experimental challenges. Therefore, utilizing available omics data to develop a diagnostic model could provide a potential solution for accurate diagnosis. This study developed a computational method integrating machine learning and bioinformatics analysis to identify potential blood biomarkers. As a result, ESM1, FCN2, MDK, GPC3, CTHRC1 and COL6A6 biomarkers were identified, holding significant promise for improving diagnosis and understanding of liver cancer. The integrated method can be applied to other cancers, offering a possible solution for early detection and improved patient outcomes.
Read full abstract