Validation Data Subsets Research Articles

This study presents an acoustic-based predictive modeling framework for estimating a suite of wood fiber attributes within jack pine (Pinus banksiana Lamb.) logs to inform in-forest log-segregation decision-making. Specifically, the relationships between acoustic velocity (longitudinal stress wave velocity; vl) and the dynamic modulus of elasticity (me), wood density (wd), microfibril angle (ma), tracheid wall thickness (wt), tracheid radial and tangential diameters (dr and dt, respectively), fiber coarseness (co), and specific surface area (sa) were parameterized using hierarchical mixed-effects model specifications and subsequently evaluated on their goodness-of-fit, lack-of-fit, and predictive precision. Procedurally, the data acquisition phase involved: (1) randomly selecting 61 semi-mature sample trees within ten variable-sized plots established in unthinned and thinned compartments of four natural-origin stands situated in the central portion of the Canadian Boreal Forest Region; (2) felling and sectioning each sample tree into four equal-length logs and obtaining twice-replicated vl measurements at the bottom and top cross-sectional faces of each log (n = 4), from which a log-specific mean vl value was calculated; and (3) sectioning each log at its midpoint and obtaining a cross-sectional sample disk from which a 2 × 2 cm bark-to-pith radial xylem sample was extracted and subsequently processed via SilviScan-3 to derive annual-ring-specific attribute values.
The analytical phase involved: (1) stratifying the resultant attribute–acoustic velocity observational pairs for the 243 sample logs into approximately equal-sized calibration and validation data subsets; (2) parameterizing the attribute–acoustic relationships with mixed-effects hierarchical linear regression specifications using the calibration data subset; and (3) evaluating the resultant models on the validation data subset using a suite of statistical metrics pertinent to the underlying assumptions and predictive performance. The results indicated that, apart from tracheid diameters (dr and dt), the regression models were significant (p ≤ 0.05) and unbiased predictors that adhered to the underlying parameterization assumptions. However, the relationships varied widely in terms of explanatory power (index-of-fit ranking: wt (0.53) > me > sa > co > wd >> ma (0.08)) and predictive ability (sa > wt > wd > co >> me >>> ma). Likewise, based on simulations where an acoustic-based wd estimate is used as a surrogate measure for a SilviScan-equivalent value for a newly sampled log, predictive ability also varied by attribute: 95% of all future predictions for sa, wt, co, me, and ma would be within ±12%, ±14%, ±15%, ±27%, and ±55% of the true values, respectively. Both the limitations and the potential utility of these predictive relationships for log-segregation decision-making are discussed. Future research initiatives, consisting of identifying and controlling extraneous sources of variation in acoustic velocity and establishing attribute-specific end-product-based design specifications, would be conducive to advancing the acoustic approach in boreal forest management.
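To make a statement such as "95% of future predictions fall within ±12% of the true value" concrete, the sketch below computes an empirical error bound from paired predicted and observed values in a validation subset. It is a simplified stand-in, not the interval procedure used in the study, which the abstract does not specify. The function names and the sample values are hypothetical.

```python
import math


def relative_errors_pct(predicted, observed):
    """Signed prediction errors as a percentage of the observed value."""
    return [100.0 * (p - o) / o for p, o in zip(predicted, observed)]


def mean_bias_pct(errors_pct):
    """Average signed error; values near zero suggest an unbiased predictor."""
    return sum(errors_pct) / len(errors_pct)


def empirical_error_bound(errors_pct, coverage=0.95):
    """Smallest half-width (in %) containing at least `coverage` of the
    absolute errors. Assumption: a plain empirical quantile, not a
    model-based prediction or tolerance interval."""
    ranked = sorted(abs(e) for e in errors_pct)
    k = math.ceil(coverage * len(ranked))  # number of errors to enclose
    return ranked[k - 1]


if __name__ == "__main__":
    # Hypothetical validation pairs (acoustic-based estimate, reference value).
    predicted = [101, 98, 103, 96, 105, 94, 107, 92, 109, 90]
    observed = [100] * 10
    errors = relative_errors_pct(predicted, observed)
    print("mean bias (%):", mean_bias_pct(errors))
    print("95% error bound (%):", empirical_error_bound(errors))
```

With only a handful of validation logs the empirical quantile is unstable, which is one reason a model-based interval is usually preferred in practice.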

Diffuse large B-cell lymphoma (DLBCL) can be classified into germinal center-like (GCB) and non-germinal center-like (non-GCB) molecular subtypes. These entities are driven by different intracellular oncogenic signaling pathways that lead to distinct clinical outcomes (Fang, Xu, & Li, 2010; Lenz et al., 2008). Several immunohistochemical (IHC)-based DLBCL classification algorithms have been proposed for cases in which gene expression profiling (GEP) studies are not available. However, there are major discrepancies among IHC algorithms, and between IHC algorithms and GEP (Coutinho et al., 2013).
To address these inconsistencies and determine whether an automatic classifier could accurately categorize DLBCL subtype, we present a performance comparison between eight reported IHC algorithms (Colomo, Hans, Hans modified [Hans*], Nyman, Choi, Choi modified [Choi*], and Visco-Young with three [VY3] and four [VY4] antibodies) and their counterparts developed with automatic classification techniques: Bayesian Classifier (B), Bayesian Simple Classifier (BS), Naïve Bayesian Classifier (BN), Artificial Neural Networks (ANN), and Support Vector Machines (SVM).
The Visco-Young (VY) database (Visco et al., 2012), which contains GEP data, raw IHC data for the GCET1, MUM1, FOXP1, BCL6, and CD10 antibodies, and clinical information for 475 de novo DLBCL patients, was used. According to GEP, the database contained 231 GCB and 244 non-GCB cases. Each patient in the VY database was ranked by survival as low (0–34 months, 237 patients), medium (35–69 months, 173 patients), or high (70–106 months, 65 patients).
For the implementation of the automatic classifiers, the database was split into training, testing, and validation data subsets (75%, 20%, and 5%, respectively) by random selection; to preserve the same proportion of ranked patients in each subset, the k-fold cross-validation technique was applied. The automatic classifier versions of the IHC algorithms used the same raw IHC data (antibody combination) as input; e.g., both the VY3 algorithm and the ANN VY3 classifier used CD10, FOXP1, and BCL6 raw IHC data. A total of 35 automatic classifiers were trained; because Colomo and Hans use the same set of antibodies, they are represented by the same automatic classifiers. The stopping criterion during the training stage for all classification algorithms was an error below 1 × 10⁻³ or 100 training epochs, whichever was satisfied first.
The performance of the eight IHC algorithms and the automatic classifiers was evaluated by computing the accuracy (Acc), specificity (Spec), and sensitivity (Sens), according to the Receiver Operating Characteristic procedure. Five classifiers obtained the highest metrics: ANN Choi, BS Choi, and BS Choi* with 94.2% Acc, 93.1% Spec, and 95.2% Sens, followed by SVM Choi and SVM VY4 with 94.2% Acc, 91.4% Spec, and 96.8% Sens. Choi was the IHC algorithm with the best metrics (92.5% Acc, 84.5% Spec, and 100% Sens), ranking 11th of the 43 models tested, followed by VY3 and VY4 (ranked 22nd and 23rd, respectively). Survival of the GCB and non-GCB groups identified by these models was compared using Kaplan-Meier curves, and significance was calculated using the log-rank test. For the best five automatic classifiers and the Choi IHC algorithm, overall survival was better for GCB than for non-GCB cases (p < 0.05).
To statistically compare the models with GEP, the results of all automatic classifiers and IHC algorithms were analyzed by Cohen’s kappa (κ) for agreement analysis and by Pearson’s chi-squared test. Among the IHC algorithms, only Choi had a very good agreement with GEP (κ = 0.85, p < 0.001). The best five automatic supervised classifiers provided a perfect agreement with GEP (κ = 0.88, p < 0.001). Moreover, the agreement between IHC algorithms ranged mainly from moderate to good (κ: 0.41–0.79), except for Choi, which had a very good agreement with both VY3 and VY4 (κ = 0.95, p < 0.001). Conversely, a very good agreement among supervised classifiers was observed (κ: 0.77–1.00).
Automatic classifiers harness all of the available immunohistochemical data and thereby increase DLBCL classification accuracy relative to the pre-existing decision tree algorithms. We conclude that the four-antibody BS Choi* automatic classifier provided the best metrics and represents an affordable and time-saving alternative for DLBCL molecular subtype identification.
Disclosures: No relevant conflicts of interest to declare.
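The evaluation metrics named above (accuracy, specificity, sensitivity, and Cohen's kappa) can all be derived from a 2 × 2 confusion table against the GEP reference. The following is a minimal sketch of that arithmetic. The abstract does not report confusion matrices, so the counts used here are hypothetical, and treating one subtype as the "positive" class is likewise an assumption made only for illustration.

```python
def binary_metrics(tp, fn, fp, tn):
    """Accuracy, sensitivity, and specificity from a 2 x 2 confusion table.

    tp/fn: reference-positive cases called positive/negative by the model.
    fp/tn: reference-negative cases called positive/negative by the model.
    """
    total = tp + fn + fp + tn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }


def cohen_kappa(tp, fn, fp, tn):
    """Cohen's kappa: agreement between model and reference beyond chance."""
    total = tp + fn + fp + tn
    observed = (tp + tn) / total
    # Chance agreement from the marginal totals of both raters.
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / total**2
    return (observed - expected) / (1 - expected)


if __name__ == "__main__":
    # Hypothetical counts for a 120-case evaluation set (not from the source).
    counts = dict(tp=59, fn=3, fp=4, tn=54)
    for name, value in binary_metrics(**counts).items():
        print(f"{name}: {100 * value:.1f}%")
    print(f"kappa: {cohen_kappa(**counts):.2f}")
```

Kappa is reported alongside accuracy because it discounts the agreement expected by chance, which matters when the two subtypes are nearly balanced, as they are in this database (231 GCB versus 244 non-GCB).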

Articles published on Validation Data Subsets

Development of a machine learning model for systematics of Aspergillus section Nigri using synchrotron radiation-based Fourier transform infrared spectroscopy

Development of Spatiotemporal Whole-Stem Models for Estimating End-Product-Based Fibre Attribute Determinates for Jack Pine and Red Pine

Predicting uniaxial compressive strength from drilling variables aided by hybrid machine learning

Metabolome-Based Classification of Snake Venoms by Bioinformatic Tools.

Stable downward continuation of the gravity potential field implemented using deep learning

Machine Learning Strategies for the Retrieval of Leaf-Chlorophyll Dynamics: Model Choice, Sequential Versus Retraining Learning, and Hyperspectral Predictors.

Modelling Stand Variables of Beech Coppice Forest Using Spectral Sentinel-2A Data and the Machine Learning Approach

Age estimation in a long-lived seabird (Ardenna tenuirostris) using DNA methylation-based biomarkers.

Acoustic Velocity—Wood Fiber Attribute Relationships for Jack Pine Logs and Their Potential Utility

Molecular Subtype Classification of Diffuse Large B-Cell Lymphoma By Immunohistochemical Algorithms and Automatic Supervised Classifiers

Genomic Prediction Accounting for Residual Heteroskedasticity.

EuroSCORE II

Abstract 3276: Assessment of Predictive Accuracy of Centers for Medicare and Medicaid Services’ Method to Risk Adjust Patients for Interhospital Comparison of 30-day Mortality Rates

Retrieval of pigment concentrations and size structure of algal populations from their absorption spectra using multilayered perceptrons

An Improved Outdoor Calibration Procedure for Broadband Ultraviolet Radiometers

Estimation of sensible and latent heat flux from natural sparse vegetation surfaces using surface renewal
