Abstract

Atherosclerosis is the underlying pathology in a large share of cardiovascular disease, the leading cause of mortality in developed countries. The infiltration of monocytes into the vessel walls of large arteries is a key determinant of atherogenesis, which makes monocytes central to the development of atherosclerosis. With the development of high-throughput transcriptome profiling platforms and cytometric methods for circulating cells, it is now feasible to study in depth the predicted functional changes of circulating monocytes, as reflected by changes in gene expression in certain pathways, and to correlate these changes with disease outcome. Neuroimmune guidance cues comprise a group of circulating and cell membrane-associated signaling proteins that are increasingly recognized as being involved in monocyte functions. Here, we employed the CIRCULATING CELLS study cohort to classify cardiovascular disease patients and healthy individuals in relation to their expression of neuroimmune guidance cues in circulating monocytes. To cope with the complexity of human datasets, which are characterized by noisy data, nonlinearity and multidimensionality, we assessed various machine-learning methods. Of these, the linear discriminant analysis, Naïve Bayesian model and stochastic gradient boost model yielded perfect or near-perfect sensitivity and specificity and revealed that expression levels of the neuroimmune guidance cues SEMA6B, SEMA6D and EPHA2 in circulating monocytes were of predictive value for cardiovascular disease outcome.

Highlights

  • Cardiovascular diseases (CVD) remain a leading cause of death in the more economically developed countries, despite improvements in surgical and drug treatments

  • By applying different machine-learning methods to the gene expression data of peripheral monocytes from the CIRCULATING CELLS study cohort, we investigated whether monocytic neuroimmune guidance cue (NGC) expression is informative for distinguishing between healthy individuals and CVD patients

  • Linear discriminant analysis, Naïve Bayesian and stochastic gradient boosting models performed best among the models assessed, in both the training set and the test set, with an accuracy of more than 0.98 and, in the test set, a Cohen’s Kappa of more than 0.75. These results indicate that these three models were able to translate the informative part of NGC expression data into disease outcome
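The metrics quoted in these highlights (accuracy, Cohen’s Kappa, and the sensitivity and specificity mentioned in the abstract) can all be derived from a binary confusion matrix. The following is a minimal sketch in plain Python, not the authors' implementation; the function names are hypothetical. The usage lines take the cohort's reported group sizes (356 CVD, 12 healthy) purely as an illustration of class imbalance, not as the study's actual test split.

```python
def confusion_counts(y_true, y_pred, positive):
    """Count TP, FP, FN, TN for a binary problem with the given positive label."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of all samples classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def sensitivity(tp, fn):
    """True positive rate: correctly identified positives over all positives."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negative rate: correctly identified negatives over all negatives."""
    return tn / (tn + fp)


def cohens_kappa(tp, fp, fn, tn):
    """Agreement between prediction and truth, corrected for chance agreement."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement: product of marginal frequencies, summed over both classes.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / (n * n)
    if expected == 1.0:
        return 0.0  # convention chosen here for the degenerate single-class case
    return (observed - expected) / (1.0 - expected)


# Illustration only: a trivial classifier that labels every subject as CVD
# in a group of 356 CVD patients (positive class) and 12 healthy individuals.
trivial = dict(tp=356, fp=12, fn=0, tn=0)
print(round(accuracy(**trivial), 3), round(cohens_kappa(**trivial), 3))
```

With such imbalanced groups, a classifier that always predicts CVD already reaches an accuracy of about 0.97 while its Kappa is zero, which is one reason to read accuracy alongside a chance-corrected measure.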


Summary

Introduction

Cardiovascular diseases (CVD) remain a leading cause of death in the more economically developed countries, despite improvements in surgical and drug treatments. By applying different machine-learning methods (also known as predictive modeling methods) to the gene expression data of peripheral monocytes from the CIRCULATING CELLS study cohort, we investigated whether monocytic NGC expression is informative for distinguishing between healthy individuals and CVD patients. We sought to include NGCs with high expression levels and good univariate correlation with the outcome. Figure 2a shows the expression of NGCs in the cohort. Based on the detection threshold of the profiling platform, NGCs with signals higher than 6.75 (log scale unless specified otherwise) were unconditionally included in the modeling as potential features. To validate the microarray analysis, we compared the monocytic NGC expression profile obtained by microarray to that obtained by real-time PCR. Both methods showed a similar expression profile, with the exception of SEMA3E (Figure S1). To gain understanding of the univariate correlation of the features to the outcome, violin plots of the NGC expressions were created to compare the distribution of NGC expression levels in the CVD group and the healthy group (Figure 2b and Table S1). Regardless, sex and age were included in our modeling process, as it is common practice to control for these conventional confounding factors.

Clinical characteristics of the CIRCULATING CELLS cohort.

Clinical Characteristics    All             CVD             Healthy
Demographic data
  Number (male/female)      368 (273/95)    356 (264/92)    12 (9/3)
  Age                       61.8 (±10.4)    62.2 (±10.3)    49.2 (±6.3)
  BMI                       27.3 (±4.3)     27.4 (±4.3)     23.7 (±2.3)
Coronary risk factors
  Hypertension              231 (63%)       231 (65%)       0 (0%)
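The expression filter described in this section (NGCs with log-scale signals higher than 6.75 are kept as candidate features) can be sketched as follows. This is an illustrative sketch only: the text does not say how a gene's signal is summarized across samples, so comparing the mean signal to the threshold is an assumption, and the gene names and values below are hypothetical placeholders rather than study data.

```python
from statistics import mean

# Detection threshold quoted in the text (log-scale signal).
DETECTION_THRESHOLD = 6.75


def select_expressed_genes(expression, threshold=DETECTION_THRESHOLD):
    """Return genes whose mean log-scale signal exceeds the threshold.

    `expression` maps a gene name to a list of per-sample signals.
    Using the mean across samples is an assumption made for this sketch.
    """
    return sorted(
        gene for gene, signals in expression.items() if mean(signals) > threshold
    )


# Hypothetical placeholder data, not values from the study.
example_expression = {
    "GENE_A": [7.9, 8.1, 7.6],  # clearly above the threshold
    "GENE_B": [5.2, 5.9, 6.1],  # below the threshold
    "GENE_C": [6.9, 7.0, 6.8],  # just above the threshold
}

selected = select_expressed_genes(example_expression)
print(selected)
```

Genes that pass such a filter would then enter the modeling step as potential features, alongside covariates such as sex and age.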

Performance of Different Models
Features with the Most Importance in the Models
Discussion
Study Population
Isolation of Peripheral Blood CD14-Positive Monocytes
RNA Isolation and Microarray Analysis
Statistical Analysis
Findings
Model Fitting and Assessment of Model Performance