Abstract
Non-alcoholic fatty liver disease (NAFLD) is a chronic liver disease that presents a great challenge for treatment and prevention. This study aims to implement a machine learning approach that uses omics datasets to identify potential biomarker targets. We developed a pipeline to identify potential biomarkers for NAFLD that comprises five major steps: pre-processing, feature selection, generation of a random forest model, downstream feature analysis and biological interpretation. The pre-processing step includes data normalisation and variable extraction, accompanied by appropriate annotations. Feature selection based on a differential gene expression analysis is then conducted to identify significant features, which are used to generate a random forest model whose performance is assessed using a receiver operating characteristic curve. Next, the features are subjected to downstream analyses, such as univariate analysis, pathway enrichment analysis, network analysis and the generation of correlation plots, boxplots and heatmaps. Once the results are obtained, biological interpretation and literature validation are conducted on the identified features and results. We applied this pipeline to transcriptomics and lipidomics datasets and concluded that the C4BPA gene could play a role in the development of NAFLD. Activation of the complement pathway, due to downregulation of the C4BPA gene, leads to an increase in triglyceride content, which might further alter lipid metabolism. This approach identified the C4BPA gene, an inhibitor of the complement pathway, as a potential biomarker for the development of NAFLD.
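The abstract states that the random forest model is assessed with a receiver operating characteristic (ROC) curve. As a minimal sketch of what that assessment summarises, the area under the ROC curve can be computed as the fraction of case/control pairs that the model ranks correctly. The function name `roc_auc` and the labels and scores below are illustrative assumptions, not the study's code or data.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney U).

    labels: 1 for a case (e.g. NAFLD), 0 for a control.
    scores: predicted probability of being a case.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    # Count case/control pairs ranked correctly; ties count as half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical held-out predictions, not data from the study.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
auc = roc_auc(labels, scores)
```

An area of 0.5 corresponds to random ranking and 1.0 to perfect separation of cases from controls.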
Highlights
The data were split into training, testing and validation sets and were subjected to pre-processing, normalization, data integration, batch-effect correction, principal component analysis (PCA), differential gene expression analysis, identification of common significant genes, supervised analysis using random forest and biological interpretation.
A differential gene expression analysis was conducted on the derived transcriptomics datasets.
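The text here does not say which statistical test or thresholds the differential gene expression analysis used. Purely as an illustration of the idea, the sketch below ranks genes by Welch's t statistic after a log2 fold-change filter. The function names, the cut-off of 1.0, the gene labels and the expression values are all assumptions made for this example.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Welch's t statistic for two groups with unequal variances."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def rank_genes(expr_case, expr_ctrl, min_abs_lfc=1.0):
    """Return (gene, log2 fold change, t) for genes passing the cut-off.

    Both inputs map gene -> list of log2-scale expression values, so the
    log2 fold change is simply the difference of the group means.
    """
    hits = []
    for gene in expr_case:
        lfc = mean(expr_case[gene]) - mean(expr_ctrl[gene])
        if abs(lfc) >= min_abs_lfc:
            t = welch_t(expr_case[gene], expr_ctrl[gene])
            hits.append((gene, lfc, t))
    # Strongest evidence of a group difference first.
    return sorted(hits, key=lambda h: abs(h[2]), reverse=True)


# Made-up log2 expression values for illustration only.
case = {"GENE_A": [5.0, 5.2, 4.8], "GENE_B": [7.1, 6.9, 7.0]}
ctrl = {"GENE_A": [7.0, 7.2, 6.8], "GENE_B": [7.0, 7.1, 6.9]}
hits = rank_genes(case, ctrl)
```

In this toy input only `GENE_A` passes the filter, with a negative fold change indicating lower expression in cases. A real analysis would also compute p-values and correct them for multiple testing, typically with a dedicated package.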
Summary
Introduction
Non-alcoholic fatty liver disease (NAFLD) is a form of chronic liver disease that affects 20–30% of the western population and approximately 25% of the global population [1–3]. NAFLD is associated with a wide range of conditions, including increased visceral obesity and metabolic abnormalities, such as insulin resistance, diabetes, hypertension, dyslipidemia, atherosclerosis and systemic micro-inflammation [4–9]. Driven by an inactive lifestyle and an unhealthy food culture, NAFLD has spread across countries and age groups [4,10]. The prevalence of the disease increased from 15% in 2005 to 25% in 2010, alongside a rise in the number of obesity cases [11]. The number of NAFLD cases is projected to increase from 83.1 million (2015) to 100.9 million (2030) [12].