Unsupervised Learning Strategy Research Articles

In cybersecurity, computing function-level similarity scores for binary code is essential. Because a single binary file may contain a very large number of functions, an effective learning framework must be both accurate and efficient on large volumes of data. Conventional methods have several limitations. First, accurately labeling pairs of functions is difficult, which makes it hard to apply supervised learning without risking overtraining. Second, state-of-the-art (SOTA) models often rely on pre-trained encoders or fine-grained graph comparison techniques, which are costly in time and memory. Third, the momentum update algorithm used in graph-based contrastive learning models can leak information. Surprisingly, no existing article addresses this issue. This research targets large-scale Binary Code Similarity Detection (BCSD). To overcome these problems, we propose GraphMoCo, a graph momentum contrast model that leverages multimodal structural information for efficient binary function representation learning at large scale. We adopt an unsupervised learning strategy, which eliminates the need for manual labeling. By exploiting the intrinsic structural information at multiple levels of the binary code, the model can achieve higher accuracy with a simple CNN-based architecture. A preshuffle mechanism mitigates the information leakage in the graph momentum update algorithm. In evaluation, GraphMoCo outperforms SOTA approaches on the function pair search task, with an average improvement of 7% in AUC and 10% in MRR and Recall@1. GraphMoCo also achieves a MAP of 0.93 on the more challenging dataset 2, which has a larger function pool. In a real-world known-vulnerability search scenario, GraphMoCo achieves an MRR that surpasses existing SOTA models by 5%.
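The abstract reports MRR and Recall@1 for function search. As a minimal sketch of how these two ranking metrics are commonly computed, the Python below assumes each query function has exactly one true match in the pool and that the model returns a ranked list of candidate identifiers. The function names and the toy identifiers are hypothetical and are not taken from the GraphMoCo paper.

```python
from typing import List, Sequence, Tuple


def reciprocal_rank(ranked_ids: Sequence[str], true_id: str) -> float:
    """Return 1/rank of the true match, or 0.0 if it is not in the list."""
    for rank, candidate in enumerate(ranked_ids, start=1):
        if candidate == true_id:
            return 1.0 / rank
    return 0.0


def mrr_and_recall_at_1(
    rankings: List[Sequence[str]], truths: Sequence[str]
) -> Tuple[float, float]:
    """Average the reciprocal ranks (MRR) and the top-1 hit rate (Recall@1)."""
    if not rankings or len(rankings) != len(truths):
        raise ValueError("need one non-empty ranking list per ground-truth id")
    rr = [reciprocal_rank(r, t) for r, t in zip(rankings, truths)]
    hits = [1.0 if r and r[0] == t else 0.0 for r, t in zip(rankings, truths)]
    return sum(rr) / len(rr), sum(hits) / len(hits)


# Toy example: three queries, each with a ranked pool of three candidates.
rankings = [
    ["f1", "f2", "f3"],  # true match ranked 1st
    ["g2", "g1", "g3"],  # true match ranked 2nd
    ["h3", "h2", "h1"],  # true match ranked 3rd
]
truths = ["f1", "g1", "h1"]
mrr, recall_at_1 = mrr_and_recall_at_1(rankings, truths)
```

In this toy case the reciprocal ranks are 1, 1/2, and 1/3, so MRR is 11/18 and Recall@1 is 1/3. A larger function pool generally makes both metrics harder to keep high, because more candidates compete with the true match for the top ranks.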


Constructing an accurate classification model is key to identifying maize seed varieties with optical sensing techniques. However, the optical features of seeds collected by optical sensing are easily affected by the planting environment (e.g., origin, year). As a result, the training samples used to construct the model (source domain, SD) and the samples to be recognized (target domain, TD) are subject to domain shift (DS), which ultimately undermines the model's recognition performance. In this study, a modeling scheme called unsupervised domain adversarial tri-training of neural networks (UDATNN) is proposed for maize variety recognition in the presence of DS. First, an unsupervised domain adversarial learning approach maps the raw features of the SD and TD into a low-dimensional feature space, aligning the features of the two domains and improving the discriminability of features across seed varieties. Next, the aligned low-dimensional features are used as inputs to classifiers (random forest), and a subset of the TD samples is iteratively selected and assigned pseudo-labels according to the tri-training strategy. Finally, these pseudo-labeled samples (called updating samples), together with the SD training samples, are used to retrain the classification model built with the unsupervised domain adversarial strategy, improving recognition accuracy for seed varieties under DS. Hyperspectral images of 4584 maize seeds from 7 varieties (each variety produced in two or three years) were collected to verify the performance of the proposed scheme. The recognition accuracy on the target domain reaches 91.5%, 94.1%, and 92.6% under 3 different domain shifts, an improvement of nearly 8%–22% over the no-transfer model and the traditional transfer model. Precision, recall, and F1 score were also calculated to further assess model performance, with satisfactory results. The robustness of the model was verified by examining the randomness of the updating samples and the effect of the number of SD samples on model performance through 100 random experiments and multiple experimental comparisons. The proposed UDATNN scheme can serve as a new framework for the nondestructive identification of seed varieties under domain shift.
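The abstract does not spell out how target-domain samples are chosen for pseudo-labeling. As an illustration only, the sketch below uses the classic tri-training agreement rule: a sample is pseudo-labeled for one classifier when the other two classifiers predict the same label for it. The function name, the plain-list inputs, and the toy labels are assumptions made for this example. The paper's classifiers are random forests and its selection criterion may differ.

```python
from typing import List, Sequence, Tuple


def select_updating_samples(
    predictions: Sequence[Sequence[str]], target: int
) -> List[Tuple[int, str]]:
    """Pick pseudo-labeled samples for classifier `target`.

    `predictions` holds the target-domain predictions of three classifiers.
    A sample is selected when the two classifiers other than `target` agree,
    and the agreed label becomes its pseudo-label.
    """
    if len(predictions) != 3:
        raise ValueError("tri-training expects exactly three classifiers")
    if not 0 <= target < 3:
        raise ValueError("target must be 0, 1, or 2")
    first, second = [p for i, p in enumerate(predictions) if i != target]
    return [
        (index, label_a)
        for index, (label_a, label_b) in enumerate(zip(first, second))
        if label_a == label_b
    ]


# Toy predictions of three classifiers on four target-domain seeds.
preds = [
    ["A", "B", "C", "A"],  # classifier 0
    ["A", "C", "C", "B"],  # classifier 1
    ["B", "B", "C", "A"],  # classifier 2
]
for_clf_2 = select_updating_samples(preds, target=2)  # [(0, "A"), (2, "C")]
for_clf_0 = select_updating_samples(preds, target=0)  # [(2, "C")]
```

In an iterative scheme, the selected samples would be added to the source-domain training set and the classifiers retrained, after which the selection is repeated on the remaining target-domain samples.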


Articles published on Unsupervised Learning Strategy

Enhanced landslide susceptibility mapping in data-scarce regions via unsupervised few-shot learning

Fed-Evolver: An automated evolving approach for federated Intrusion Detection System using adversarial autoencoder in SDN-enabled networks

Intelligent computing framework to analyze the transmission risk of COVID-19: Meyer wavelet artificial neural networks

A defect classification algorithm for gas tungsten arc welding process based on unsupervised learning and few-shot learning strategy

The Use of eXplainable Artificial Intelligence and Machine Learning Operation Principles to Support the Continuous Development of Machine Learning-Based Solutions in Fault Detection and Identification

HeGCL: Advance Self-Supervised Learning in Heterogeneous Graph-Level Representation

Artificial intelligence and machine learning in optics: tutorial

Finding a closest saddle–node bifurcation in power systems: An approach by unsupervised deep learning

Gas well production optimization: Classifying liquid loading severity in shale gas wells using semi-supervised learning

Temporally-preserving latent variable models: Offline and online training for reconstruction and interpretation of fault data for gearbox condition monitoring

LELD: Learn enhancement by learning degradation

Ensemble Approach Using k-Partitioned Isolation Forests for the Detection of Stock Market Manipulation

Contrastive deep convolutional transform k-means clustering

GraphMoCo: A graph momentum contrast model for large-scale binary function representation learning

Hybrid Unsupervised Learning Strategy for Monitoring Industrial Batch Processes

Combining Hough Transform and Fuzzy Unsupervised Learning Strategy in Automatic Segmentation of Large Bowel Obstruction Area from Erect Abdominal Radiographs

Statistical similarity matching and filtering for clinical image retrieval by machine learning approach

LIDA‐YOLO: An unsupervised low‐illumination object detection based on domain adaptation

UDATNN: A modeling scheme integrating unsupervised domain adversarial learning and tri-training strategy for variety recognition of maize seeds with domain shift

DEAR-GAN: Degradation-Aware Face Restoration With GAN Prior

