Majority Vote Algorithm Research Articles

Abstract: Mutation detection via genomic sequencing has become routine in cancer diagnosis and precision treatment. Although existing bioinformatics approaches vary in their algorithms and detection strategies, they all rely on combinations of statistical features computed over a set of overlapping reads to distinguish real mutations from false positives. The Achilles heel of such recognition mechanisms may be that their preset thresholds are often rigid and ambiguous, and that feature combinations with adaptive thresholds can only be enabled by machine learning frameworks. Unfortunately, learning models have to overcome the challenges posed by tumor heterogeneity. For mutations from different subclones, most sequencing features are associated with tumor purity and clonal proportions. This introduces complicated interactions among the features and breaks the independent and identically distributed (i.i.d.) assumption of classic learning models. In addition, both tumor purity and clonal proportions are variables, so no training set can enumerate all possible values. These challenges hurt the specificity of existing approaches when they are applied to cancer sequencing data. Here, we propose an approach for the scenario of varying tumor purity and clonal proportions. The proposed approach incorporates a comprehensive set of features drawn from existing strategies. It requires at least two training sets with different proportions; within any given set, tumor purity and clonal proportions are fixed. The framework first trains models on one set. The trained models focus on the associations between the features and true mutations, and are defined as a source domain. Next, when the other set is input for training, the framework not only trains another source domain but also learns the transformations among the features between the source domains. When another fixed combination of tumor purity and clonal proportions is considered, the framework is able to generate, perhaps roughly, models for the new combination from the source domains and the transformations. To enhance performance, we propose integrating several source domains to control systematic errors during the transfer process. The Boyer-Moore majority-vote algorithm is introduced to achieve the integration. We carried out a series of experiments on both simulated and real datasets and compared the method with state-of-the-art approaches, including MuTect2, Sentieon, VarScan2, Freebayes and SiNVICT. The results demonstrate that the proposed method adapts well to differently diluted sequencing signals and can significantly reduce false positives. It is implemented as TransVAF. The software package is available at https://github.com/TrinaZ/TL-fpFilter for academic use only.

Citation Format: Tian Zheng, Jiayin Wang, Xiao Xiao, Xiaoyan Zhu, Xuanping Zhang, Xin Lai, Yanfang Guan, Xin Yi. TransVAF: A transfer learning approach for recognize genomic mutations with various tumor purity and clonal proportions [abstract]. In: Proceedings of the American Association for Cancer Research Annual Meeting 2021; 2021 Apr 10-15 and May 17-21. Philadelphia (PA): AACR; Cancer Res 2021;81(13_Suppl):Abstract nr 255.
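The TransVAF abstract names the Boyer-Moore majority-vote algorithm as the way several source domains are integrated, but does not say how the votes are encoded. The following is a minimal sketch of the standard algorithm only, assuming (hypothetically) that each source-domain model emits one label per candidate mutation; the function name and the labels are illustrative and are not taken from the TransVAF package.

```python
from typing import Hashable, Optional, Sequence


def boyer_moore_majority(votes: Sequence[Hashable]) -> Optional[Hashable]:
    """Return the label held by more than half of the votes, or None.

    Standard Boyer-Moore majority vote: one pass to find a candidate,
    a second pass to verify that it really is a strict majority.
    """
    candidate = None
    count = 0
    # Pass 1: pair off differing votes; a strict majority survives.
    for vote in votes:
        if count == 0:
            candidate = vote
            count = 1
        elif vote == candidate:
            count += 1
        else:
            count -= 1
    # Pass 2: the survivor is only a candidate, so confirm it.
    occurrences = sum(1 for vote in votes if vote == candidate)
    if occurrences * 2 > len(votes):
        return candidate
    return None


# Hypothetical calls from three source-domain models for one candidate site.
calls = ["mutation", "mutation", "artifact"]
print(boyer_moore_majority(calls))
```

The verification pass matters when no strict majority exists, for example with an even split between two labels, in which case the sketch returns None instead of an arbitrary label.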

Abstract: Artificial surfaces represent one of the key land cover types, and validation is an indispensable component of land cover mapping that ensures data quality. Traditionally, validation has been carried out by comparing the produced land cover map with reference data collected through field surveys or image interpretation. However, this approach has limitations, including high costs in terms of money and time. Recently, geo-tagged photos from social media have been used as reference data. This procedure has lower costs, but interpreting geo-tagged photos is still time-consuming. Social media point of interest (POI) data, including geo-tagged photos, may contain useful textual information for land cover validation, yet this kind of textual data has seldom been analysed or used to support land cover validation. This paper examines the potential of textual information from social media POIs as a new reference source to assist in artificial surface validation without photo recognition, and proposes a validation framework using modified decision trees. First, POI datasets are classified semantically to map POIs onto the standard taxonomy of land cover maps. Then, a decision tree model is built and trained to classify POIs automatically. To eliminate the effects of spatial heterogeneity on POI classification, the shortest distances from each POI to roads and to villages serve as two factors in the modified decision tree model. Finally, a data transformation based on a majority vote algorithm is performed to convert the classified points into raster form so that confusion matrix methods can be applied to the land cover map. Using Beijing as a study area, social media POIs from Sina Weibo were collected to validate artificial surfaces in GlobeLand30 in 2010. A classification accuracy of 80.68% was achieved through our modified decision tree method. Compared with a classification method that does not account for spatial heterogeneity, the accuracy is 10% greater. This result indicates that our modified decision tree method performs well in classifying POIs with high spatial heterogeneity. In addition, a high validation accuracy of 92.76% was achieved, which is relatively close to the official result of 86.7%. These preliminary results indicate that social media POI datasets are valuable ancillary data for land cover validation, and that our proposed validation framework provides opportunities for land cover validation at low cost in terms of money and time.
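The final step described in this abstract, converting classified points to raster cells by majority vote and then comparing against the land cover map, can be sketched as follows. This is an illustration under stated assumptions, not the paper's implementation: the square grid, the `cell_size` parameter, the label names, and both function names are hypothetical, and the most frequent label in a cell wins (ties go to the label seen first).

```python
from collections import Counter, defaultdict


def rasterize_by_majority(points, cell_size):
    """Assign each grid cell the most frequent label among its points.

    points: iterable of (x, y, label) in map units (assumed projected).
    Returns {(col, row): label}; ties go to the label seen first.
    """
    cells = defaultdict(Counter)
    for x, y, label in points:
        cell = (int(x // cell_size), int(y // cell_size))
        cells[cell][label] += 1
    return {cell: counts.most_common(1)[0][0] for cell, counts in cells.items()}


def confusion_matrix(reference_raster, land_cover_map):
    """Count (map label, reference label) pairs over cells present in both."""
    matrix = Counter()
    for cell, reference_label in reference_raster.items():
        if cell in land_cover_map:
            matrix[(land_cover_map[cell], reference_label)] += 1
    return matrix


# Hypothetical classified POIs and a hypothetical two-cell land cover map.
pois = [(5, 5, "artificial"), (7, 2, "artificial"), (8, 8, "other"), (35, 5, "other")]
reference = rasterize_by_majority(pois, cell_size=30)
land_cover = {(0, 0): "artificial", (1, 0): "artificial"}
print(confusion_matrix(reference, land_cover))
```

Cells that contain no POIs are simply absent from the reference raster in this sketch, so they do not contribute to the confusion matrix.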

Articles published on Majority Vote Algorithm

Classification using semantic feature and machine learning: Land-use case application

Abstract 255: TransVAF: A transfer learning approach for recognize genomic mutations with various tumor purity and clonal proportions

AUTOMATED CLASSIFICATION OF SOLITARY PULMONARY NODULES USING CONVOLUTIONAL NEURAL NETWORK BASED ON TRANSFER LEARNING STRATEGY

Majority Vote Cascading: A Semi-Supervised Framework for Improving Protein Function Prediction.

Intra-Annual Sentinel-2 Time-Series Supporting Grassland Habitat Discrimination

Find truth in the hands of the few: acquiring specific knowledge with crowdsourcing

Multi-scale deep feature learning network with bilateral filtering for SAR image classification

Atlas-based liver segmentation for nonhuman primate research.

Asynchronous Prediction of Human Gait Intention in a Pseudo Online Paradigm Using Wavelet Transform.

TOD-CUP: a gene expression rank-based majority vote algorithm for tissue origin diagnosis of cancers of unknown primary.

The dependence of the majority voting decision-making probabilities on a multi-expert binary system experts number

Towards Benthic Habitat 3D Mapping Using Machine Learning Algorithms and Structures from Motion Photogrammetry

Robust Classification of Intramuscular EMG Signals to Aid the Diagnosis of Neuromuscular Disorders.

Predicting Stock Market Trends using Hybrid SVM Model and LSTM with Sentiment Determination using Natural Language Processing

MOP: Predicting Multiple Output in Multi-Sharing System

The Design of the Majority Voting Algorithm Based on Search Engine for the Text Copyright Detection Crawler

Hyperspectral Image Classification in the Presence of Noisy Labels

Application of majority voting and consensus voting algorithms in N-version software

An Innovative Emotion Assessment using Physiological Signals Based on The Combination Mechanism

Exploring point-of-interest data from social media for artificial surface validation with decision trees
