Test Inputs Research Articles

Pediatric acute myeloid leukemia (pAML) encompasses over 20 molecular subtypes driven by unique genetic alterations, including hallmark chromosomal rearrangements and, less frequently, point mutations or tandem duplications. Collectively, many of these pAML molecular categories are enriched in pediatric populations and are not represented by current classification systems, including the recently updated WHO and ICC classifications. While fusion detection from RNA-Seq-based approaches is robust, many fusion-negative subtypes would need to be defined by expression-based approaches, as mutation calling from RNA-Seq data is less developed. However, this could be challenging for subtypes with similar transcriptional profiles, such as those with shared HOX expression patterns, including NPM1, NUP98r, UBTF, DEK::NUP214, and KMT2A-PTD. To aid in the appropriate molecular classification of pAML, which is crucial for prognosis, we developed and compared three gene expression-based classifiers. A total of 1707 pAML gene expression profiles were mapped and analyzed from three distinct sources (St. Jude = 659; TARGET = 168; AAML1031 = 880). Raw read counts were normalized and scaled to relative expression values in transcripts per million (TPM), which served as input for feature selection, model training, and testing. Ground truth labels for all 1707 samples were obtained through multi-omics analysis, including whole genome sequencing, to identify fusions and mutations. For validation, the data were stratified by subtype and split 70/30 into training (n=1187) and testing (n=520) sets. Three machine learning models were selected: random forest, XGBoost, and linear support vector machine (SVM). Each sample had gene expression TPM data for 60,754 transcripts, of which the 20,004 transcripts from protein-coding genes were considered for feature selection.
Feature selection used the median absolute deviation (MAD) to select the 3000 transcripts with the highest variability; the top 3000 were chosen to allow adequate tuning of the number of predictors. Each model was independently trained using stratified cross-validation and Monte-Carlo search for hyperparameter tuning. The best model was selected based on the Matthews Correlation Coefficient (MCC). Each model was tested on the hold-out TPM set with z-score normalization. On the hold-out testing set (n=520), the linear SVM model outperformed the random forest and XGBoost models on five performance metrics across all subtypes (sensitivity=0.9577; precision=0.9577; specificity=0.9978; F1=0.96; accuracy=0.9958). The random forest (sensitivity=0.9154; precision=0.9154; specificity=0.9955; F1=0.92; accuracy=0.9915) and XGBoost models (sensitivity=0.9231; precision=0.9231; specificity=0.9960; F1=0.92; accuracy=0.9923) also performed well across all subtypes. Although feature selection was shared across all three models, performance within each subtype varied between models. The linear SVM model demonstrated strong performance overall, driven by high specificity in classifying the KMT2Ar subgroup (n=127) and equal sensitivity across the GLIS-rearranged (n=14), GATA1 (n=10), BCL11B (n=7), CBFB::MYH11 (n=60), CEBPA (n=33), and RUNX1::RUNX1T1 (n=78) subtypes. The primary difference in performance between models was the high false positive rate for KMT2Ar and NPM1 (n=55) in the random forest and XGBoost models. A preliminary hypothesis is that this reflects the large representation of KMT2Ar and NPM1 in the training data (24.85% and 10.78%, respectively).
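The MAD-based feature selection described above can be illustrated with a minimal sketch. It assumes a samples-by-features matrix of TPM values held in plain Python lists; the function names, gene names, and toy values are hypothetical and are not taken from the study.

```python
from statistics import median


def mad(values):
    """Median absolute deviation: median of |x - median(x)|."""
    center = median(values)
    return median([abs(v - center) for v in values])


def top_variable_features(matrix, names, k):
    """Return the k feature names with the largest MAD.

    matrix: list of samples, each a list of expression values,
            one value per feature, in the same order as `names`.
    """
    columns = list(zip(*matrix))  # transpose to one tuple per feature
    scored = [(mad(col), name) for col, name in zip(columns, names)]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [name for _, name in scored[:k]]


# Toy data (hypothetical): 4 samples x 3 genes.
tpm = [
    [10.0, 5.0, 0.0],
    [12.0, 5.0, 8.0],
    [11.0, 5.0, 16.0],
    [50.0, 5.0, 24.0],
]
genes = ["geneA", "geneB", "geneC"]

# geneC varies steadily across samples, geneA has a single outlier,
# geneB is constant, so the top two by MAD are geneC then geneA.
selected = top_variable_features(tpm, genes, 2)
print(selected)
```

In this toy case geneA's single large value barely moves its MAD, which is the usual reason for preferring MAD over variance when ranking genes by variability. In the study the same idea would apply with k=3000 over the protein-coding transcripts.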
Synthetic upsampling (SMOTE) of the training dataset (n=1369) counteracts bias toward the majority classes and increases performance for the random forest (sensitivity=0.9327; precision=0.9327; specificity=0.9965; F1=0.93; accuracy=0.9933), XGBoost (sensitivity=0.9404; precision=0.9404; specificity=0.9969; F1=0.94; accuracy=0.9940), and linear SVM (sensitivity=0.9615; precision=0.9615; specificity=0.9980; F1=0.96; accuracy=0.9962) models. Together, these models demonstrate the utility and effectiveness of a machine learning approach for classifying pAML samples from transcriptome sequencing data, which may have broad clinical and research utility, especially for fusion-negative subtypes.
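The reported sensitivity and precision are identical for every model. In a single-label multiclass setting, that is what micro-averaging over one-vs-rest counts produces, because each misclassified sample adds exactly one false negative and one false positive. The abstract does not state the averaging scheme, so the following is only a sketch under that assumption; the function name and toy labels are hypothetical.

```python
def micro_one_vs_rest(y_true, y_pred, labels):
    """Pool one-vs-rest TP/FP/FN/TN over all classes, then compute metrics."""
    tp = fp = fn = tn = 0
    for label in labels:
        for truth, pred in zip(y_true, y_pred):
            if pred == label and truth == label:
                tp += 1
            elif pred == label:
                fp += 1
            elif truth == label:
                fn += 1
            else:
                tn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }


# Toy example (hypothetical labels): 6 samples, 3 classes, 1 error.
labels = ["KMT2Ar", "NPM1", "CEBPA"]
y_true = ["KMT2Ar", "KMT2Ar", "NPM1", "NPM1", "CEBPA", "CEBPA"]
y_pred = ["KMT2Ar", "KMT2Ar", "NPM1", "KMT2Ar", "CEBPA", "CEBPA"]

metrics = micro_one_vs_rest(y_true, y_pred, labels)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Under this scheme specificity and accuracy are inflated by the many true negatives that each one-vs-rest comparison contributes, which would explain why they sit well above sensitivity in the reported figures.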

Graph Neural Networks (GNNs) have achieved promising performance in a variety of practical applications. Like traditional DNNs, GNNs can exhibit incorrect behavior that may lead to severe consequences, so testing is necessary and crucial. However, labeling all the test inputs for GNNs can be costly and time-consuming, especially when dealing with large and complex graphs, which seriously affects the efficiency of GNN testing. Existing studies have focused on test prioritization for DNNs, which aims to identify and prioritize fault-revealing tests (i.e., test inputs that are more likely to be misclassified) to detect system bugs earlier in a limited time. Although some DNN prioritization approaches have proven effective, there is a significant problem when applying them to GNNs: they do not take into account the connections (edges) between GNN test inputs (nodes), which play a significant role in GNN inference. In general, DNN test inputs are independent of each other, while GNN test inputs are usually represented as a graph with complex relationships among the tests. In this article, we propose GraphPrior (GNN-oriented Test Prioritization), a set of approaches to prioritize test inputs specifically for GNNs via mutation analysis. Inspired by mutation testing in traditional software engineering, in which test suites are evaluated based on the mutants they kill, GraphPrior generates mutated models for GNNs and regards test inputs that kill many mutated models as more likely to be misclassified. GraphPrior then leverages the mutation results in two ways: killing-based and feature-based methods. When scoring a test input, the killing-based method considers each mutated model equally important, while feature-based methods learn a different importance for each mutated model through ranking models. Finally, GraphPrior ranks all the test inputs based on their scores.
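The killing-based scoring described above can be sketched as follows. The sketch assumes that a test input "kills" a mutated model when that model's prediction for the input differs from the original model's prediction; the abstract does not define killing, so this is an assumption, and the function names and toy predictions are hypothetical.

```python
def killing_scores(original_preds, mutant_preds):
    """Count, for each test input, how many mutated models it kills.

    original_preds: predicted label per test input from the original model.
    mutant_preds:   one list of predicted labels per mutated model.
    Every mutated model carries equal weight, as in the killing-based method.
    """
    scores = []
    for i, base in enumerate(original_preds):
        killed = sum(1 for preds in mutant_preds if preds[i] != base)
        scores.append(killed)
    return scores


def prioritize(scores):
    """Return test-input indices ordered from highest to lowest score."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)


# Toy predictions (hypothetical): 4 test nodes, 3 mutated models.
original = [0, 1, 2, 1]
mutants = [
    [0, 1, 1, 1],
    [0, 2, 1, 1],
    [1, 1, 1, 1],
]

scores = killing_scores(original, mutants)
order = prioritize(scores)
print(scores)  # [1, 1, 3, 0]
print(order)   # [2, 0, 1, 3]
```

A feature-based variant would instead treat each input's per-mutant kill vector as features for a learned ranking model, so that mutated models can carry different weights.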
We conducted an extensive study based on 604 subjects to evaluate GraphPrior on both natural and adversarial test inputs. The results demonstrate that KMGP, the killing-based GraphPrior approach, outperforms the compared approaches in a majority of cases, with an average improvement of 4.76% to 49.60% in terms of APFD. Furthermore, the feature-based GraphPrior approach, RFGP, performs the best among all the GraphPrior approaches. On adversarial test inputs, RFGP outperforms the compared approaches across different adversarial attacks, with an average improvement of 2.95% to 46.69%.
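For reference, APFD (Average Percentage of Fault Detection) is commonly computed as 1 - (sum of the 1-based ranks of fault-revealing tests) / (n * m) + 1 / (2n), where n is the number of tests and m the number of fault-revealing tests. The abstract does not spell out the formula, so the sketch below assumes this common definition and treats each misclassified test input as one fault; the function name is hypothetical.

```python
def apfd(ranked_is_fault):
    """APFD for a prioritized list of tests.

    ranked_is_fault: booleans in prioritized order, True where the test
                     input is fault-revealing (assumed: misclassified).
    """
    n = len(ranked_is_fault)
    positions = [rank for rank, is_fault
                 in enumerate(ranked_is_fault, start=1) if is_fault]
    m = len(positions)
    if n == 0 or m == 0:
        raise ValueError("APFD needs at least one test and one fault")
    return 1 - sum(positions) / (n * m) + 1 / (2 * n)


# Faults ranked first give a high APFD; faults ranked last give a low one.
best_case = apfd([True, True, False, False])
worst_case = apfd([False, False, True, True])
print(best_case, worst_case)
```

Higher values mean misclassified inputs appear earlier in the ranking, which is the goal of prioritization.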

Articles published on Test Inputs

STTA: enhanced text classification via selective test-time augmentation.

Improved Test Input Prioritization Using Verification Monitors with False Prediction Cluster Centroids

Microcontroller Based Smart Energy Meter with Data Logger System

LMdist: Local Manifold distance accurately measures beta diversity in ecological gradients.

A novel approach for the fractional SLS material model experimental identification

Gene Expression Machine Learning Models Classify Pediatric AML Subtypes with High Performance

Problems on the theme "Maximum non-decreasing subsequence"

GraphPrior: Mutation-based Test Input Prioritization for Graph Neural Networks

Variationally mimetic operator networks

ENN: Hierarchical Image Classification Ensemble Neural Network for Large-Scale Automated Detection of Potential Design Infringement

Automated black-box boundary value detection

Generating and detecting true ambiguity: a forgotten danger in DNN supervision testing

Vehicle detection and classification using three variations of you only look once algorithm

Automatic Test Sequence Generation and Functional Coverage Measurement From UML Sequence Diagrams

Boosting Fuzzer Efficiency: An Information Theoretic Perspective

Automated and Efficient Test-Generation for Grid-Based Multiagent Systems

Fault localization in DSLTrans model transformations by combining symbolic execution and spectrum-based analysis

Clinical interpretation of cell-based non-invasive prenatal testing for monogenic disorders including repeat expansion disorders: potentials and pitfalls.

System Identification of Unmanned Air Systems at Texas A&M University

Physics-Guided Adversarial Machine Learning for Aircraft Systems Simulation
