Original Training Data Research Articles

Background: Machine learning is a class of machine intelligence techniques that learn from data and detect inherent patterns in large, complex datasets. Because of this capability, machine learning techniques are widely used in medical applications, especially those involving large-scale genomic and proteomic data. Cancer classification based on bio-molecular profiling data is an important topic for medical applications because it improves the diagnostic accuracy of cancer and supports successful treatment. Hence, machine learning techniques are widely used in cancer detection and prognosis.

Methods: In this article, a new ensemble machine learning classification model, the Multiple Filtering and Supervised Attribute Clustering algorithm based Ensemble Classification model (MFSAC-EC), is proposed to handle the class imbalance problem and the high dimensionality of microarray datasets. The model first generates a number of bootstrapped datasets from the original training data, applying an oversampling procedure to handle class imbalance. The proposed MFSAC method is then applied to each bootstrapped dataset to generate sub-datasets, each of which contains a subset of the most relevant and informative attributes of the original dataset. MFSAC is a feature selection technique that combines multiple filters with a new supervised attribute clustering algorithm. A base classifier is then constructed separately for every sub-dataset, and the predictions of these base classifiers are combined by majority voting to form the MFSAC-based ensemble classifier. In addition, the most informative attributes are selected as important features based on how frequently they occur across the sub-datasets.

Results: To assess its performance, the proposed MFSAC-EC model is applied to different high-dimensional microarray gene expression datasets for cancer sample classification and compared with well-known existing models. The experimental results show that the generalization performance (testing accuracy) of the proposed classifier is significantly better than that of the other well-known existing models. The proposed model is also found to identify many important attributes (biomarker genes).
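
The pipeline described in the Methods section (bootstrap, oversample, select features, train one base classifier per sub-dataset, combine by majority vote, count feature frequency) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' MFSAC-EC implementation: the feature filter is a simple stand-in (difference of class means), the base classifier is a nearest-centroid rule, the bootstrap is stratified by class, and the function names and toy data are all hypothetical.

```python
import random
from collections import Counter

def group_by_class(rows, labels):
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(label, []).append(row)
    return groups

def bootstrap_and_oversample(rows, labels, rng):
    """Stratified bootstrap, then random oversampling of the smaller classes."""
    groups = group_by_class(rows, labels)
    target = max(len(members) for members in groups.values())
    out_rows, out_labels = [], []
    for label, members in groups.items():
        # Bootstrap: draw with replacement inside each class (an assumption).
        sample = [rng.choice(members) for _ in members]
        # Oversample: duplicate random draws until the class reaches the majority size.
        sample += [rng.choice(sample) for _ in range(target - len(sample))]
        out_rows.extend(sample)
        out_labels.extend([label] * len(sample))
    return out_rows, out_labels

def centroids(rows, labels, features):
    """Per-class mean vector restricted to the given feature indices."""
    return {
        label: [sum(r[f] for r in members) / len(members) for f in features]
        for label, members in group_by_class(rows, labels).items()
    }

def select_features(rows, labels, k):
    """Stand-in filter (NOT MFSAC): keep the k features whose class means differ most."""
    all_features = list(range(len(rows[0])))
    means = centroids(rows, labels, all_features)

    def spread(f):
        values = [m[f] for m in means.values()]
        return max(values) - min(values)

    return sorted(all_features, key=spread, reverse=True)[:k]

def train_ensemble(rows, labels, n_models=5, k=2, seed=0):
    rng = random.Random(seed)
    models, feature_counts = [], Counter()
    for _ in range(n_models):
        b_rows, b_labels = bootstrap_and_oversample(rows, labels, rng)
        features = select_features(b_rows, b_labels, k)
        feature_counts.update(features)  # how often each feature is selected
        models.append((features, centroids(b_rows, b_labels, features)))
    return models, feature_counts

def predict(models, row):
    """Majority vote over the nearest-centroid base classifiers."""
    votes = []
    for features, cents in models:
        def dist(label):
            return sum((row[f] - c) ** 2 for f, c in zip(features, cents[label]))
        votes.append(min(cents, key=dist))
    return Counter(votes).most_common(1)[0][0]

# Hypothetical toy data: 6 "normal" vs 2 "tumor" samples, 4 features each.
# Features 0 and 1 separate the classes; features 2 and 3 are noise.
ROWS = [
    [1.0, 5.0, 0.5, 0.4], [1.2, 5.1, 0.4, 0.5], [0.9, 4.8, 0.6, 0.5],
    [1.1, 5.2, 0.5, 0.6], [1.0, 4.9, 0.4, 0.4], [0.8, 5.0, 0.5, 0.5],
    [5.0, 1.0, 0.5, 0.5], [5.2, 0.9, 0.4, 0.6],
]
LABELS = ["normal"] * 6 + ["tumor"] * 2

models, feature_counts = train_ensemble(ROWS, LABELS)
print(predict(models, [4.9, 1.2, 0.5, 0.5]))
print(feature_counts.most_common(2))
```

In this sketch, `feature_counts` plays the role of the frequency-of-occurrence ranking mentioned in the abstract: features that are selected in many sub-datasets are treated as the most informative.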

Purpose: The practice of artificial intelligence (AI) is increasingly being promoted by technology developers. However, its adoption rate in the construction industry is still reported as low, owing to a lack of expertise and the limited number of reliable applications for AI technology. Hence, this paper aims to present the detailed outcome of experiments evaluating the applicability and performance of AI object detection algorithms for construction modular object detection.

Design/methodology/approach: This paper provides a thorough evaluation of two deep learning algorithms for object detection: the faster region-based convolutional neural network (faster RCNN) and the single shot multi-box detector (SSD). Two types of metrics are presented: first, the average recall and mean average precision by image pixels; second, the recall and precision by counting. To conduct the experiments using the selected algorithms, four infrastructure and building construction sites are chosen to collect the required data, comprising a total of 990 images of three different but common modular objects: modular panels, safety barricades and site fences.

Findings: The results of the comprehensive evaluation show that the performance of faster RCNN and SSD depends on the context in which detection occurs. Surrounding objects and the backgrounds of the objects affect the level of accuracy obtained from the AI analysis and may particularly affect precision and recall. The analysis of loss lines shows that the loss lines for the selected objects depend on both their geometry and the image background. The results show that faster RCNN offers higher accuracy than SSD for detection of the selected objects.

Research limitations/implications: The results show that modular object detection is crucial in construction for obtaining the information required for project quality and safety objectives. The detection process can significantly improve the monitoring of object installation progress in an accurate, machine-based manner that avoids human errors. The results of this paper are limited to three construction sites, but future investigations can cover more tasks or objects from different construction sites in a fully automated manner.

Originality/value: This paper’s originality lies in offering new AI applications in modular construction, using a large first-hand data set collected from three construction sites. Furthermore, the paper presents the scientific evaluation results of implementing recent object detection algorithms across a set of extended metrics, using the original training and validation data sets to improve the generalisability of the experimentation. This paper also provides practitioners and scholars with a workflow for AI applications in the modular context and with first-hand reference data.
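
As a rough illustration of what "recall and precision by counting" can mean for an object detector, the sketch below matches predicted boxes to ground-truth boxes and counts hits and misses. The abstract does not state the matching rule used in the paper, so the intersection-over-union (IoU) threshold of 0.5, the greedy one-to-one matching, the function names and the example boxes are all assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union else 0.0

def count_matches(predicted, ground_truth, threshold=0.5):
    """Greedily match each prediction to its best remaining ground-truth box."""
    unmatched = list(ground_truth)
    true_positives = 0
    for box in predicted:
        best = max(unmatched, key=lambda g: iou(box, g), default=None)
        if best is not None and iou(box, best) >= threshold:
            unmatched.remove(best)  # each ground-truth box is matched at most once
            true_positives += 1
    false_positives = len(predicted) - true_positives
    false_negatives = len(unmatched)
    return true_positives, false_positives, false_negatives

def precision_recall(tp, fp, fn):
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

# Hypothetical example: three real objects, three detections (one spurious).
GROUND_TRUTH = [(0, 0, 10, 10), (20, 20, 30, 30), (40, 40, 50, 50)]
PREDICTED = [(1, 1, 10, 10), (21, 20, 30, 30), (60, 60, 70, 70)]

counts = count_matches(PREDICTED, GROUND_TRUTH)
precision, recall = precision_recall(*counts)
print(counts, precision, recall)
```

Here precision is the share of detections that correspond to a real object, and recall is the share of real objects that were detected; a cluttered background would typically show up as extra false positives or missed objects in these counts.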

Articles published on Original Training Data

Voice Conversion Based Augmentation and a Hybrid CNN-LSTM Model for Improving Speaker-Independent Keyword Recognition on Limited Datasets

A Mutual Guide Framework for Training Hyperspectral Image Classifiers With Small Data

Generative Adversarial Minority Oversampling for Spectral–Spatial Hyperspectral Image Classification

Conditional Generative Data-Free Knowledge Distillation

Modified Correlation Weight K-Nearest Neighbor Classifier Using Training Dataset Cleaning Method

Combining Self-supervised Learning and Active Learning for Disfluency Detection

Polymorphic Adversarial Cyberattacks Using WGAN

Applying Data Augmentation and Mask R-CNN-Based Instance Segmentation Method for Mixed-Type Wafer Maps Defect Patterns Classification

Black-Box Diagnosis and Calibration on GAN Intra-Mode Collapse: A Pilot Study

Dual discriminator adversarial distillation for data-free model compression

Convolutional-network models to predict wall-bounded turbulence from wall quantities

An update to the custom-made MS Excel workbook performing the log-rank test with extended functionality and a new original COVID-19 training data set

Deep Learning with Data Augmentation to Add Data Around Classification Boundaries

An ensemble machine learning model based on multiple filtering and supervised attribute clustering algorithm for classifying cancer samples

Privacy preservation in federated learning: An insightful survey from the GDPR perspective

Applications of object detection in modular construction based on a comparative evaluation of deep learning algorithms

Multilingual Transfer Learning for QA using Translation as Data Augmentation

Robustness to Spurious Correlations in Text Classification via Automatically Generated Counterfactuals

Continual Learning for Named Entity Recognition

An Improved Discriminative Model Prediction Approach to Real-Time Tracking of Objects With Camera as Sensors
