Independent Validation Data Research Articles

Background: It is important to recognize severe ill coronavirus disease 2019 (COVID-19) patients from moderate ones in order to save more lives. We attempted to present an predictor for disease severity from clinical laboratory markers using sparse principal component analysis method. Methods: Forty-four clinical characteristics and laboratory markers of 82 COVID-19 patients (Hefei cohort) from the First Affiliated Hospital of University of Science and Technology of China (USTC) were analyzed retrospectively and sparse principal component analysis (SPCA) was performed to examine the correlation between the markers and extract relevant features. The controlling parameter alpha of SPCA was adjusted for better variable selection. Then the produced principal components (PCs) by SPCA were subjected to multivariate logistic regression for disease severity prediction, and the significant PCs were selected. Then, a Lymphocyte-Monocyte-Neutrophil index (LMN index) was deduced from the significant PCs and was used for disease severity prediction. Furthermore, an independent cohort including 169 COVID-19 patients (Nanchang Cohort) from the First Affiliated Hospital of NanChang University was used as a validation dataset and prediction efficiency of LMN index and classical clinical markers were also evaluated. Findings: Using SPCA, the first to thirteenth PCs accounted for 81·7% of the cumulative proportion variance of the original 44 clinical characteristics and laboratory markers. Multivariate logistic regression revealed the PC1 was significantly associated with disease severity with odds ratio of 74272·28 (623·83 - 178483250). When the controlling parameter alpha was adjusted to 0·001, the PC1 is only dependent on five laboratory markers: lymphocyte count (LYM), lymphocyte percentage (LYM%), neutrophil count (NEU), monocyte count (MONO) ,and serum phosphorus. LMN index determined by LYM, LYM%, NEU ,and MONO was deduced from the PC1 and significant relationships were investigated between LMN indices with age, comorbidity status and CD4+ ,and CD8 T lymphocyte counts. More important, during hospitalization, LMN indices decreased obviously as treatment takes effect, and they declined more sharply for mild ill COVID-19 patients compared with those of severe ill ones. When used to predict disease progression, the LMN index gave the accuracy of 0·780 and 0·760 in the training data (Hefei cohort) and the independent validation data (Nanchang Cohort) respectively, which was more efficient than classical clinical markers. Interpretation: Using SPCA method, the LMN index determined by four blood routine test markers was deduced. It showed robust disease severity prediction efficiency of COVID-19 patients and have the potential for clinical applications. Funding Statement: Fundamental Research Funds for the Central Universities of China(No. YD9110002001). Declaration of Interests: XLM reports grants from Fundamental Research Funds for the Central Universities of China. All other authors declare no competing interests. Ethics Approval Statement: This study was approved by the Ethics Committee of the First Affiliated Hospital of USTC and the Ethics Committee of the First Affiliated Hospital of NanChang University.

Read full abstract

ABSTRACT The South Asia (India, Pakistan, Bangladesh, Nepal, Sri Lanka and Bhutan) has a staggering 900 million people (~43% of the population) who face food insecurity or severe food insecurity as per United Nations, Food and Agriculture Organization’s (FAO) the Food Insecurity Experience Scale (FIES). The existing coarse-resolution (≥250-m) cropland maps lack precision in geo-location of individual farms and have low map accuracies. This also results in uncertainties in cropland areas calculated from such products. Thereby, the overarching goal of this study was to develop a high spatial resolution (30-m or better) baseline cropland extent product of South Asia for the year 2015 using Landsat satellite time-series big-data and machine learning algorithms (MLAs) on the Google Earth Engine (GEE) cloud computing platform. To eliminate the impact of clouds, 10 time-composited Landsat bands (blue, green, red, NIR, SWIR1, SWIR2, Thermal, EVI, NDVI, NDWI) were derived for each of the three time-periods over 12 months (monsoon: Days of the Year (DOY) 151–300; winter: DOY 301–365 plus 1–60; and summer: DOY 61–150), taking the every 8-day data from Landsat-8 and 7 for the years 2013–2015, for a total of 30-bands plus global digital elevation model (GDEM) derived slope band. This 31-band mega-file big data-cube was composed for each of the five agro-ecological zones (AEZ’s) of South Asia and formed a baseline data for image classification and analysis. Knowledge-base for the Random Forest (RF) MLAs were developed using spatially well spread-out reference training data (N = 2179) in five AEZs. The classification was performed on GEE for each of the five AEZs using well-established knowledge-base and RF MLAs on the cloud. Map accuracies were measured using independent validation data (N = 1185). The survey showed that the South Asia cropland product had a producer’s accuracy of 89.9% (errors of omissions of 10.1%), user’s accuracy of 95.3% (errors of commission of 4.7%) and an overall accuracy of 88.7%. The National and sub-national (districts) areas computed from this cropland extent product explained 80-96% variability when compared with the National statistics of the South Asian Countries. The full-resolution imagery can be viewed at full-resolution, by zooming-in to any location in South Asia or the world, at www.croplands.org and the cropland products of South Asia downloaded from The Land Processes Distributed Active Archive Center (LP DAAC) of National Aeronautics and Space Administration (NASA) and the United States Geological Survey (USGS): https://lpdaac.usgs.gov/products/gfsad30saafgircev001/.

Read full abstract

Independent Validation Data Research Articles

Related Topics

Articles published on Independent Validation Data

Single-center versus multi-center data sets for molecular prognostic modeling: a simulation study

Automatic Triage of 12‐Lead ECGs Using Deep Convolutional Neural Networks

A comprehensive review and performance evaluation of bioinformatics tools for HLA class I peptide-binding prediction.

Lymphocyte-Monocyte-Neutrophil Index: A Predictor of Severity of Coronavirus Disease 2019 Patients Produced by Sparse Principal Component Analysis

Augmenting Interpretation of Chest Radiographs With Deep Learning Probability Maps.

A Validated Model to Predict Postoperative Symptom Severity After Mandibular Third Molar Removal

Establishment of Multiple Myeloma Diagnostic Model Based on Logistic Regression in Clinical Laboratory.

Comprehensive analysis of gene expression and DNA methylation data identifies potential biomarkers and functional epigenetic modules for lung adenocarcinoma.

Detection of Gait From Continuous Inertial Sensor Data Using Harmonic Frequencies.

Urothelial Carcinoma Detection Based on Copy Number Profiles of Urinary Cell-Free DNA by Shallow Whole-Genome Sequencing.

Fine-Resolution Mapping of Soil Total Nitrogen across China Based on Weighted Model Averaging

Agricultural cropland extent and areas of South Asia derived using Landsat satellite 30-m time-series big-data using random forest machine learning algorithms on the Google Earth Engine cloud

Assessing the impact of an invasive bryophyte on plant species richness using high resolution imaging spectroscopy

Predicting forest understory habitat for Canada lynx using LIDAR data

Comparison of Fixed- and Mixed-Effects Approaches to Taper Modeling for Scots Pine in West Poland

The Influence of Temperature and Community Structure on Light Absorption by Phytoplankton in the North Atlantic.

Three Novel Players: PTK2B, SYK, and TNFRSF21 Were Identified to Be Involved in the Regulation of Bovine Mastitis Susceptibility via GWAS and Post-transcriptional Analysis.

Remote Sensing to Detect Nests of the Leaf-Cutting Ant Atta sexdens (Hymenoptera: Formicidae) in Teak Plantations

A hybrid random forest to predict soccer matches in international tournaments

Enhancing VGI application semantics by accounting for spatial bias

Lead the way for us

Editage

Paperpal

R Discovery

Mind the Graph

Independent Validation Data Research Articles

Related Topics

Articles published on Independent Validation Data

Single-center versus multi-center data sets for molecular prognostic modeling: a simulation study

Automatic Triage of 12‐Lead ECGs Using Deep Convolutional Neural Networks

A comprehensive review and performance evaluation of bioinformatics tools for HLA class I peptide-binding prediction.

Lymphocyte-Monocyte-Neutrophil Index: A Predictor of Severity of Coronavirus Disease 2019 Patients Produced by Sparse Principal Component Analysis

Augmenting Interpretation of Chest Radiographs With Deep Learning Probability Maps.

A Validated Model to Predict Postoperative Symptom Severity After Mandibular Third Molar Removal

Establishment of Multiple Myeloma Diagnostic Model Based on Logistic Regression in Clinical Laboratory.

Comprehensive analysis of gene expression and DNA methylation data identifies potential biomarkers and functional epigenetic modules for lung adenocarcinoma.

Detection of Gait From Continuous Inertial Sensor Data Using Harmonic Frequencies.

Urothelial Carcinoma Detection Based on Copy Number Profiles of Urinary Cell-Free DNA by Shallow Whole-Genome Sequencing.

Fine-Resolution Mapping of Soil Total Nitrogen across China Based on Weighted Model Averaging

Agricultural cropland extent and areas of South Asia derived using Landsat satellite 30-m time-series big-data using random forest machine learning algorithms on the Google Earth Engine cloud

Assessing the impact of an invasive bryophyte on plant species richness using high resolution imaging spectroscopy

Predicting forest understory habitat for Canada lynx using LIDAR data

Comparison of Fixed- and Mixed-Effects Approaches to Taper Modeling for Scots Pine in West Poland

The Influence of Temperature and Community Structure on Light Absorption by Phytoplankton in the North Atlantic.

Three Novel Players: PTK2B, SYK, and TNFRSF21 Were Identified to Be Involved in the Regulation of Bovine Mastitis Susceptibility via GWAS and Post-transcriptional Analysis.

Remote Sensing to Detect Nests of the Leaf-Cutting Ant Atta sexdens (Hymenoptera: Formicidae) in Teak Plantations

A hybrid random forest to predict soccer matches in international tournaments

Enhancing VGI application semantics by accounting for spatial bias