A Commit Classification Framework Incorporated With Prompt Tuning and External Knowledge

  • Abstract
  • Similar Papers
Abstract

Commit classification is an important task in software maintenance, since it helps software developers classify code changes into different types according to their nature and purpose. This allows them to better understand how their development efforts are progressing, identify areas where they need improvement, and make informed decisions about when and how to release new versions of their software. However, existing methods are all discriminative models, usually with complex architectures that require additional output layers to produce class-label probabilities, making them task-specific and unable to learn features across different tasks. Moreover, they require a large amount of labeled data for fine-tuning, and it is difficult to learn effective classification boundaries when labeled data are limited. To solve these problems, we propose a generative framework that incorporates prompt tuning for commit classification with external knowledge (IPCK), which simplifies the model structure and learns features across different tasks, using only the commit message as input. First, we propose a generative framework based on T5 (text-to-text transfer transformer). This encoder–decoder architecture unifies different commit classification tasks into a single text-to-text problem, simplifying the model's structure by removing the need for an extra output layer. Second, instead of fine-tuning, we design a prompt-tuning solution that can be adopted in few-shot scenarios with only limited samples. Furthermore, we incorporate external knowledge via a knowledge graph to map the probabilities of predicted words onto the final labels in the verbalizer step, improving performance in few-shot scenarios.
Extensive experiments on two publicly available datasets demonstrate that our framework solves the commit classification problem simply yet effectively, reaching 90% accuracy for single-label binary classification and 83% for single-label multiclass classification. Furthermore, in few-shot scenarios, our method improves the adaptability of the model without requiring a large number of training samples for fine-tuning.
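The verbalizer step described above can be illustrated with a toy sketch: the model's probabilities for label-related words are aggregated into per-class scores, with each label's word list standing in for an external knowledge graph's expansion of that label. All labels, words, and probabilities here are hypothetical, not taken from the paper.

```python
# Illustrative verbalizer: map a model's word probabilities to commit-class
# labels. The word lists stand in for a knowledge graph's expansion of each
# label (hypothetical values).
LABEL_WORDS = {
    "bugfix":   ["fix", "bug", "error", "patch"],
    "feature":  ["add", "feature", "implement", "support"],
    "refactor": ["refactor", "clean", "rename", "restructure"],
}

def verbalize(word_probs: dict) -> str:
    """Aggregate per-word probabilities into a label score and pick the max."""
    scores = {
        label: sum(word_probs.get(w, 0.0) for w in words)
        for label, words in LABEL_WORDS.items()
    }
    return max(scores, key=scores.get)

# Toy probabilities the masked position might receive from the model:
probs = {"fix": 0.4, "bug": 0.2, "add": 0.1, "refactor": 0.05}
print(verbalize(probs))  # bugfix
```

Expanding each label into several related words is what lets the mapping work even when the model spreads probability mass over synonyms.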

Similar Papers
  • Research Article
  • Citations: 31
  • 10.1080/2150704x.2018.1557787
Semi-supervised deep learning for hyperspectral image classification
  • Jan 3, 2019
  • Remote Sensing Letters
  • Xudong Kang + 2 more

Recently, a series of deep learning methods based on convolutional neural networks (CNNs) have been introduced for the classification of hyperspectral images (HSIs). However, to obtain optimal parameters, CNNs require a large number of training samples to avoid overfitting. In this paper, a novel method is proposed to extend the training set for deep-learning-based hyperspectral image classification. First, given a small-sample-size training set, principal component analysis based edge-preserving features (PCA-EPFs) and extended morphological attribute profiles (EMAPs) are used for HSI classification to generate classification probability maps. Second, a large number of pseudo training samples are obtained by a designed decision function that depends on the classification probabilities. Finally, a deep feature fusion network (DFFN) is applied to classify HSIs with a training set consisting of the original small-sample-size training set and the added pseudo training samples. Experiments performed on several hyperspectral data sets demonstrate the state-of-the-art performance of the proposed method in terms of classification accuracy.
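The decision function for selecting pseudo training samples is not specified above; a plausible minimal sketch keeps a pixel only when two classifiers (here standing in for the PCA-EPF and EMAP results) agree on the class and are both confident above a threshold. Names and values are invented for the example.

```python
# Sketch of a pseudo-labelling decision function (assumed form): accept a
# pixel as a pseudo training sample only when both classifiers agree on the
# class and both are confident above a threshold.
def select_pseudo_samples(probs_a, probs_b, threshold=0.9):
    """probs_a/probs_b: lists of per-class probability vectors, one per pixel.
    Returns (index, class) pairs accepted as pseudo samples."""
    accepted = []
    for i, (pa, pb) in enumerate(zip(probs_a, probs_b)):
        ca = max(range(len(pa)), key=pa.__getitem__)  # argmax of classifier A
        cb = max(range(len(pb)), key=pb.__getitem__)  # argmax of classifier B
        if ca == cb and pa[ca] >= threshold and pb[cb] >= threshold:
            accepted.append((i, ca))
    return accepted

probs_epf = [[0.95, 0.05], [0.6, 0.4], [0.02, 0.98]]
probs_emap = [[0.92, 0.08], [0.3, 0.7], [0.05, 0.95]]
print(select_pseudo_samples(probs_epf, probs_emap))  # [(0, 0), (2, 1)]
```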

  • Book Chapter
  • Citations: 1
  • 10.1007/978-3-030-03338-5_6
Research on the Method of Tibetan Recognition Based on Component Location Information
  • Jan 1, 2018
  • Yuehui Han + 3 more

The recognition of Tibetan characters is of great significance to the study of Tibetan culture, yet progress in Tibetan character recognition lags behind, and recognition is especially difficult when a large number of training samples is not available. We therefore propose a recognition method for Tibetan characters based on component location information that does not require a large number of training samples. The proposed method has three main parts: (1) segmentation of the character and extraction of the components, which carry location information within the character; (2) feature extraction and classifier design; and (3) superposition of the recognized components and retrieval of the character. In testing, the recognition rate is 98.4% for single components and 97.2% for multilevel components, indicating that the method works well for Tibetan character recognition and is helpful for recognizing Tibetan documents.

  • Research Article
  • Citations: 24
  • 10.1007/s11760-016-1035-x
Visual object tracking with online sample selection via lasso regularization
  • Jan 11, 2017
  • Signal, Image and Video Processing
  • Qiao Liu + 3 more

In recent years, discriminative methods have been popular in visual tracking. The main idea of a discriminative method is to learn a classifier that distinguishes the target from the background, and the key step is updating the classifier. Usually, the tracked results are chosen as positive samples to update the classifier, so when the tracked results are inaccurate, the update fails and the tracker drifts away from the target. Additionally, without an appropriate sample selection strategy, a large number of training samples hinders online updating of the classifier. To address the drift problem, we propose a score function that predicts the optimal candidate directly instead of learning a classifier. Furthermore, to handle the large number of training samples, we design a sparsity-constrained sample selection strategy that chooses representative support samples at the updating stage. To evaluate the effectiveness and robustness of the proposed method, we run experiments on the object tracking benchmark and 12 challenging sequences. The results demonstrate that our approach achieves promising performance.

  • Research Article
  • Citations: 1
  • 10.3390/rs15184432
A Weak Sample Optimisation Method for Building Classification in a Semi-Supervised Deep Learning Framework
  • Sep 8, 2023
  • Remote Sensing
  • Yanjun Wang + 5 more

Deep learning has gained widespread interest for building semantic segmentation from remote sensing images; however, neural network models require a large number of training samples to achieve good classification performance, and they are sensitive to erroneous patches in the training samples. Semi-supervised classification methods can work with less-reliable, weakly labelled samples, but current semi-supervised research feeds the generated weak samples directly into the model, with little consideration of how improving the accuracy and quality of the weak samples affects subsequent classification. To address the generation and quality optimisation of weak samples from training data in deep learning, this paper proposes a semi-supervised building classification framework. Firstly, based on the test results of the remote sensing image segmentation model and the unsupervised classification results of LiDAR point cloud data, the framework quickly generates weak image samples of buildings. Secondly, to improve the quality of the weak samples' patches, an iterative optimisation strategy compares the weak samples with the real samples and extracts the accurate ones. Finally, the real samples, the weak samples, and the optimised weak samples are fed into the building semantic segmentation model for accuracy evaluation and analysis. The approach was experimentally verified on two different building datasets; the optimised weak samples improved test mIoU by 1.9% and 0.6%, respectively, over the initial weak samples.
The results demonstrate that the semi-supervised classification framework proposed in this paper can be used to alleviate the model’s demand for a large number of real-labelled samples while improving the ability to utilise weak samples, and it can be used as an alternative to fully supervised classification methods in deep learning model applications that require a large number of training samples.

  • Book Chapter
  • Citations: 2
  • 10.1007/978-3-031-19433-7_9
Context-Driven Visual Object Recognition Based on Knowledge Graphs
  • Jan 1, 2022
  • Sebastian Monka + 2 more

Current deep learning methods for object recognition are purely data-driven and require a large number of training samples to achieve good results. Due to their sole dependence on image data, these methods tend to fail when confronted with new environments where even small deviations occur. Human perception, however, has proven to be significantly more robust to such distribution shifts. It is assumed that the human ability to deal with unknown scenarios is based on extensive incorporation of contextual knowledge. Context can be based either on object co-occurrences in a scene or on memory of experience. In accordance with the human visual cortex, which uses context to form different object representations for a seen image, we propose an approach that enhances deep learning methods with external contextual knowledge encoded in a knowledge graph. We extract different contextual views from a generic knowledge graph, transform the views into vector space, and infuse them into a DNN. We conduct a series of experiments to investigate the impact of different contextual views on the learned object representations for the same image dataset. The experimental results provide evidence that the contextual views influence the image representations in the DNN differently and therefore lead to different predictions for the same images. We also show that context helps strengthen the robustness of object recognition models to out-of-distribution images, which commonly occur in transfer learning tasks and real-world scenarios. Keywords: neuro-symbolic, knowledge graph, contextual learning
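As a much-simplified, hypothetical stand-in for the vector-space infusion described above, contextual support from a co-occurrence graph can be used to re-rank classifier scores. The graph edges, weights, and blending factor are invented for the example.

```python
# Toy sketch: derive contextual support from a small co-occurrence graph and
# use it to re-rank raw classifier scores (all values hypothetical).
CO_OCCUR = {  # object -> objects commonly seen with it, with edge weights
    "keyboard": {"monitor": 0.9, "mouse": 0.8},
    "horse":    {"saddle": 0.7, "field": 0.6},
}

def context_score(candidate, scene_objects):
    """How well the candidate label fits the other objects in the scene."""
    return sum(w for obj, w in CO_OCCUR.get(candidate, {}).items()
               if obj in scene_objects)

def rerank(scores, scene_objects, alpha=0.5):
    """Blend raw classifier scores with contextual support."""
    return {c: s + alpha * context_score(c, scene_objects)
            for c, s in scores.items()}

# "horse" narrowly wins on raw score, but the scene context favours "keyboard".
scores = {"keyboard": 0.4, "horse": 0.45}
print(rerank(scores, {"monitor", "mouse"}))
```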

  • Research Article
  • Citations: 91
  • 10.1016/j.apgeochem.2021.104994
Detection of the multivariate geochemical anomalies associated with mineralization using a deep convolutional neural network and a pixel-pair feature method
  • May 19, 2021
  • Applied Geochemistry
  • Chunjie Zhang + 2 more


  • Conference Article
  • Citations: 6
  • 10.1109/iciea48937.2020.9248182
Transfer Learning in General Lensless Imaging through Scattering Media
  • Nov 9, 2020
  • Yukuan Yang + 6 more

Recently, deep neural networks (DNNs) have been successfully introduced to lensless imaging through scattering media. By solving an inverse problem in computational imaging, DNNs can overcome several shortcomings of conventional methods, namely high cost, poor quality, complex control, and poor anti-interference. However, training requires collecting a large number of samples across various datasets, and a DNN trained on one dataset generally performs poorly when recovering images from another; the underlying reason is that lensless imaging through scattering media is a high-dimensional regression problem with no analytical solution. In this work, transfer learning is proposed to address this issue. Our main idea is to train a DNN on a relatively complex dataset using a large number of training samples and then fine-tune the last few layers using very few samples from other datasets. Instead of the thousands of samples required to train from scratch, transfer learning alleviates the problem of costly data acquisition. Specifically, considering the differences in sample size and similarity among datasets, we propose two DNN architectures, LISMU-FCN and LISMU-OCN, and a balance loss function designed to balance smoothness and sharpness. LISMU-FCN, with far fewer parameters, achieves imaging across similar datasets, while LISMU-OCN achieves imaging across significantly different datasets. Moreover, we establish a set of simulation algorithms close to the real experiments, which is of practical value for research on lensless imaging through scattering media. In summary, this work provides a new solution for lensless imaging through scattering media using transfer learning in DNNs.
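The fine-tuning scheme sketched above (freeze the early layers, retrain only the last few) can be illustrated with a toy layer list; the layer names and parameter counts below are made up for the example.

```python
# Minimal illustration of fine-tuning only the last few layers: mark all
# earlier layers as frozen, so only a small fraction of parameters is
# updated on the new dataset.
class Layer:
    def __init__(self, name, n_params):
        self.name, self.n_params, self.trainable = name, n_params, True

def freeze_all_but_last(layers, k=2):
    """Mark every layer except the final k as frozen."""
    for layer in layers[:-k]:
        layer.trainable = False
    return layers

net = [Layer(f"conv{i}", 10_000) for i in range(6)] + [Layer("fc", 2_000)]
freeze_all_but_last(net, k=2)
trainable = sum(l.n_params for l in net if l.trainable)
total = sum(l.n_params for l in net)
print(f"{trainable}/{total} parameters fine-tuned")  # 12000/62000
```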

  • Research Article
  • Citations: 82
  • 10.1007/s41651-019-0039-9
Scene Classification of High-Resolution Remotely Sensed Image Based on ResNet
  • Oct 10, 2019
  • Journal of Geovisualization and Spatial Analysis
  • Mingchang Wang + 4 more

Remote sensing technology for earth observation is becoming increasingly important with rapid economic and social development. High-spatial-resolution remote sensing images offer distinct layers, clear texture, and rich spatial information, and have broad areas of application. Deep learning models can acquire the depth features contained in images, but they usually require a large number of training samples. In this study, we propose a method for scene-level classification of high-spatial-resolution images when a large number of training samples cannot be provided. We extracted depth features of high-resolution remote sensing images using a residual learning network (ResNet), together with low-level features including color moment features and gray-level co-occurrence matrix features. We used these to construct scene-level semantic features of high-resolution images and trained a support vector machine (SVM) classifier. Using the sample migration method, with the UC Merced Land Use (UCM) data set as the migration source, scene classification accuracy on a GF-2 data set reaches 95.71% with a small sample size. Finally, through this method, scene-level classification of GF-2 images is implemented in practice.
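The colour-moment features mentioned above are commonly defined per channel as mean, standard deviation, and skewness; the sketch below uses that standard definition, though the paper's exact variant may differ.

```python
# Colour moments for one image channel: mean, standard deviation, and a
# signed cube root of the third central moment as the skewness term.
import math

def color_moments(channel):
    n = len(channel)
    mean = sum(channel) / n
    var = sum((v - mean) ** 2 for v in channel) / n
    std = math.sqrt(var)
    third = sum((v - mean) ** 3 for v in channel) / n
    skew = math.copysign(abs(third) ** (1 / 3), third)  # sign-preserving root
    return mean, std, skew

pixels = [10, 12, 11, 40]  # one channel of a tiny 2x2 image
m, s, sk = color_moments(pixels)
print(m, s, sk)
```

Concatenating these three moments per channel gives a compact low-level descriptor that can be stacked with deep features before the SVM.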

  • Conference Article
  • Citations: 30
  • 10.1109/icpr.1990.118138
Small sample size effects in statistical pattern recognition: recommendations for practitioners and open problems
  • Jun 16, 1990
  • S.J Raudys + 1 more

The authors discuss the effects of sample size on feature selection and error estimation for several types of classifiers. In addition to surveying prior work in this area, they give practical advice to today's designers and users of statistical pattern recognition systems. It is pointed out that one needs a large number of training samples if a complex classification rule with many features is being utilized. In many pattern recognition problems, the number of potential features is very large and not much is known about the characteristics of the pattern classes under consideration; thus, it is difficult to determine a priori the complexity of the classification rule needed. Therefore, even when the designer believes that a large number of training samples has been selected, they may not be enough for designing and evaluating the classification problem at hand. It is further noted that a small sample size can cause many problems in the design of a pattern recognition system.

  • Research Article
  • Citations: 80
  • 10.1016/j.bspc.2015.09.001
Spectral Collaborative Representation based Classification for hand gestures recognition on electromyography signals
  • Sep 16, 2015
  • Biomedical Signal Processing and Control
  • Ali Boyali + 1 more


  • Conference Article
  • Citations: 4
  • 10.1109/cisp.2014.7003879
Maximum classification optimization-based active learning for image classification
  • Oct 1, 2014
  • Zhengwei Cui + 4 more

Traditional multi-class image classification needs a large number of training samples to build a classifier model. However, obtaining labels for a large number of training samples from human experts is time-consuming and costly, and active learning is a feasible solution. This paper proposes a maximum classification optimization (MCO) method for actively selecting unlabeled images for labeling. It integrates information about an unlabeled sample from different perspectives in two steps: it first chooses a subset of candidates, and then selects the best sample from these candidates. Our experimental results show that the maximum classification optimization method outperforms two popular existing methods (entropy-based uncertainty and BvSB).
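A hypothetical sketch of the two-step candidate-then-select idea combines the two baselines named above: shortlist uncertain samples by entropy, then pick the one with the smallest best-versus-second-best (BvSB) margin. The paper's actual criteria may differ; probabilities below are invented.

```python
# Two-step active selection: entropy shortlist, then BvSB margin.
import math

def entropy(p):
    return -sum(x * math.log(x) for x in p if x > 0)

def bvsb_margin(p):
    top = sorted(p, reverse=True)
    return top[0] - top[1]  # small margin = ambiguous between two classes

def select_query(prob_vectors, shortlist=2):
    # Step 1: shortlist the most uncertain samples by entropy.
    candidates = sorted(range(len(prob_vectors)),
                        key=lambda i: entropy(prob_vectors[i]),
                        reverse=True)[:shortlist]
    # Step 2: among candidates, choose the smallest BvSB margin.
    return min(candidates, key=lambda i: bvsb_margin(prob_vectors[i]))

probs = [[0.90, 0.05, 0.05],   # confident
         [0.40, 0.35, 0.25],   # somewhat ambiguous
         [0.34, 0.33, 0.33]]   # near-uniform: queried for a label
print(select_query(probs))  # 2
```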

  • Book Chapter
  • Citations: 2
  • 10.3233/shti230099
Post Hoc Sample Size Estimation for Deep Learning Architectures for ECG-Classification
  • May 18, 2023
  • Lucas Bickmann + 2 more

Deep learning architectures for time series require a large number of training samples; however, traditional sample-size estimation for sufficient model performance is not applicable to machine learning, especially in the field of electrocardiograms (ECGs). This paper outlines a sample-size estimation strategy for binary classification problems on ECGs using different deep learning architectures and the large publicly available PTB-XL dataset, which includes 21801 ECG samples. This work evaluates binary classification tasks for Myocardial Infarction (MI), Conduction Disturbance (CD), ST/T Change (STTC), and Sex. All estimations are benchmarked across different architectures, including XResNet, InceptionTime, XceptionTime, and a fully convolutional network (FCN). The results indicate trends in required sample sizes for given tasks and architectures, which can serve as orientation for future ECG studies or feasibility assessments.

  • Conference Article
  • Citations: 47
  • 10.1145/3238147.3238219
ClDiff: generating concise linked code differences
  • Sep 3, 2018
  • Kaifeng Huang + 6 more

Analyzing and understanding source code changes is important in a variety of software maintenance tasks. To this end, many code differencing and code change summarization methods have been proposed. For some tasks (e.g. code review and software merging), however, those differencing methods generate too fine-grained a representation of code changes, and those summarization methods generate too coarse-grained a representation of code changes. Moreover, they do not consider the relationships among code changes. Therefore, the generated differences or summaries make it difficult to analyze and understand code changes in some software maintenance tasks. In this paper, we propose a code differencing approach, named CLDIFF, to generate concise linked code differences whose granularity lies between those of existing code differencing and code change summarization methods. The goal of CLDIFF is to generate more easily understandable code differences. CLDIFF takes source code files before and after changes as inputs, and consists of three steps. First, it pre-processes the source code files by pruning unchanged declarations from the parsed abstract syntax trees. Second, it generates concise code differences by grouping fine-grained code differences at or above the statement level and describing high-level changes in each group. Third, it links the related concise code differences according to five pre-defined links. Experiments with 12 Java projects (74,387 commits) and a human study with 10 participants have indicated the accuracy, conciseness, performance and usefulness of CLDIFF.
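A heavily simplified, line-based stand-in for the grouping step (CLDIFF itself groups fine-grained differences on abstract syntax trees): collapse changed line numbers into contiguous groups.

```python
# Group sorted changed-line numbers into contiguous hunks; `gap` controls
# how close two changes must be to land in the same group.
def group_changes(changed_lines, gap=1):
    groups, current = [], [changed_lines[0]]
    for line in changed_lines[1:]:
        if line - current[-1] <= gap:
            current.append(line)       # extend the current group
        else:
            groups.append(current)     # close it and start a new one
            current = [line]
    groups.append(current)
    return groups

print(group_changes([3, 4, 5, 9, 10, 42]))  # [[3, 4, 5], [9, 10], [42]]
```

Each resulting group is a coarser unit than a single line edit but finer than a whole-commit summary, which is the granularity trade-off the abstract describes.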

  • Research Article
  • Citations: 31
  • 10.1007/s11432-014-0372-y
Mining authorship characteristics in bug repositories
  • Nov 23, 2016
  • Science China Information Sciences
  • He Jiang + 4 more

Bug reports are widely employed to facilitate tasks in software maintenance. Since bug reports are contributed by people, the authorship characteristics of contributors may heavily impact the performance of resolving software tasks; poorly written bug reports may delay developers in fixing bugs. However, no in-depth investigation has been conducted on these authorship characteristics. In this study, we first leverage byte-level N-grams to model the authorship characteristics and employ Normalized Simplified Profile Intersection (NSPI) to measure their similarity. Then, we investigate a series of properties of contributors' authorship characteristics, including their evolution over time and their variation among distinct products in open source projects. Moreover, we show how to leverage the authorship characteristics to facilitate a well-known task in software maintenance, namely Bug Report Summarization (BRS). Experiments on open source projects validate that incorporating the authorship characteristics effectively improves a state-of-the-art BRS method. Our findings suggest that contributors should retain stable authorship characteristics, which can assist in resolving software tasks.
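A minimal sketch of byte-level N-gram profiles with a simplified profile-intersection similarity; this is one plausible reading of NSPI, and the paper's exact normalisation may differ.

```python
# Byte-level N-gram profiles and a normalised intersection similarity.
def ngram_profile(text, n=3):
    data = text.encode("utf-8")
    return {data[i:i + n] for i in range(len(data) - n + 1)}

def nspi(a, b, n=3):
    """Shared n-grams, normalised by the smaller profile (assumed form)."""
    pa, pb = ngram_profile(a, n), ngram_profile(b, n)
    return len(pa & pb) / min(len(pa), len(pb))

same = nspi("crash when saving file", "crash when opening file")
diff = nspi("crash when saving file", "typo in docs")
print(same > diff)  # True
```

Comparing profiles of whole report histories, rather than single reports, is how such a measure would track a contributor's stable writing style over time.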

  • Research Article
  • Citations: 6
  • 10.32620/reks.2022.3.12
A novel approach for semantic segmentation of automatic road network extractions from remote sensing images by modified UNet
  • Oct 4, 2022
  • RADIOELECTRONIC AND COMPUTER SYSTEMS
  • Miral J Patel + 2 more

Accurate and up-to-date road maps are crucial for numerous applications such as urban planning, automatic vehicle navigation systems, and traffic monitoring systems. However, even in high-resolution remote sensing images, the background and roads look similar because of occlusion by trees and buildings, and it is difficult to accurately segment the road network from complex background images. In this research paper, a deep-learning-based algorithm is proposed to segment road networks from remote sensing images. The semantic segmentation algorithm was developed with a modified UNet. Because fewer remote sensing images are available for semantic segmentation, data augmentation was used. Initially, the semantic segmentation network was trained with a large number of training samples using the traditional UNet architecture; the number of training samples was then reduced gradually and the performance of the traditional UNet model measured. This basic UNet model gives good results in terms of accuracy, IoU, DICE score, and image visualization for 362 training samples. Since the goal here is simply to extract road data from remote sensing images, a deeper encoder-decoder structure is unnecessary, so the modified UNet uses fewer convolutional layers than the standard UNet. This reduces the complexity of the deep learning architecture and the training time required by the road network model. The model achieved an intersection over union (IoU) of 93.71%, and the average segmentation time for a single image was 0.28 s. The results showed that the modified UNet can efficiently segment road networks from remote sensing images with identical backgrounds, and it can be used in various situations.
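The IoU metric reported above can be computed for binary road masks as the intersection of positive pixels over their union:

```python
# IoU for binary masks flattened to 0/1 lists: |pred AND truth| / |pred OR truth|.
def iou(pred, truth):
    inter = sum(1 for p, t in zip(pred, truth) if p == 1 and t == 1)
    union = sum(1 for p, t in zip(pred, truth) if p == 1 or t == 1)
    return inter / union if union else 1.0

pred  = [1, 1, 0, 0, 1]  # predicted road pixels
truth = [1, 0, 0, 1, 1]  # ground-truth road pixels
print(iou(pred, truth))  # 0.5
```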
