Keyword Extraction Algorithm Research Articles

Existing methods for reversible natural language watermarking have evolved from synonym substitution to arbitrary word substitution. A notable limitation, however, is that they tend to overlook the sensitivity of the information carried by the original words and prefer non-sensitive words for substitution. This creates a risk that sensitive information in the original text is leaked. Furthermore, the pursuit of reversibility may inadvertently compromise the overall performance of the watermarking method. In response to these problems, this paper proposes a novel reversible natural language watermarking method that combines a Keyword Substitution scheme and a Prediction Error Expansion algorithm (KSPEE), with applications such as protecting sensitive information, verifying content integrity, and protecting copyright. Specifically, KSPEE uses a keyword extraction algorithm to identify important content containing sensitive information in the original text, which determines the candidate positions for embedding watermark information. A masked language model then predicts suitable substitution words from the semantic context surrounding each embedding position. The prediction error expansion algorithm selects which of these words replaces each original keyword, so that the watermark information is embedded while the original keywords remain recoverable. Identifying the keywords and substituting them thus also protects the original sensitive information. Extensive experiments demonstrate that, subject to the constraints on semantic distortion and lossless restoration of the original content, KSPEE achieves outstanding watermarked text quality, a higher watermark embedding rate, and strong security. More importantly, KSPEE effectively prevents the leakage of sensitive information.
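The abstract does not spell out how prediction error expansion is adapted to word substitution, so the following is only a minimal sketch of the classic integer form of the technique, on which such schemes are generally built. The function names `pee_embed` and `pee_extract` are hypothetical, and the integer `value` stands in for whatever quantity a scheme chooses to predict (for example, a word's rank among a language model's candidates, which is an assumption here and not taken from the paper).

```python
from typing import Tuple


def pee_embed(value: int, predicted: int, bit: int) -> int:
    """Embed one watermark bit by expanding the prediction error.

    The error (value - predicted) is doubled and the bit is stored
    in the least significant position of the expanded error.
    """
    error = value - predicted
    return predicted + 2 * error + bit


def pee_extract(marked: int, predicted: int) -> Tuple[int, int]:
    """Recover (original value, embedded bit) from a marked value.

    The decoder must reproduce the same prediction as the encoder.
    """
    expanded = marked - predicted
    bit = expanded % 2            # non-negative in Python, even if expanded < 0
    error = (expanded - bit) // 2
    return predicted + error, bit


if __name__ == "__main__":
    marked = pee_embed(value=7, predicted=5, bit=1)
    print(marked, pee_extract(marked, predicted=5))
```

Reversibility depends on the encoder and decoder computing the same prediction, which is why a scheme of this kind needs the prediction context to be unchanged by the embedding.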

Reported bugs in software systems are classified into severity levels before they are fixed. Bug reports are often not evenly distributed across these severity levels. However, most severity prediction models in the literature assume that the underlying data are evenly distributed, which does not hold in all cases. The aim of this study is therefore to develop bug classification models from unevenly distributed datasets and to test them accordingly. To that end, the topics or keywords of the developer descriptions in bug reports are first extracted using the Rapid Automatic Keyword Extraction (RAKE) algorithm and then converted into numerical attributes, which are combined with the severity levels to construct the datasets. These datasets are used to build classification models with the Naïve Bayes, Logistic Regression, and Decision Tree Learner algorithms. Because the models learn from skewed data, their prediction quality is measured using the Area Under the Receiver Operating Characteristic Curve (AUC). According to the results, the Logistic Regression model achieves an AUC of 0.65, whereas the other two models reach a maximum AUC of 0.60. Although the datasets contain comparatively few instances of the high-severity classes, Blocking and High, the Logistic Regression models predict these two classes with an AUC of 0.65. Hence, this project shows that models can be trained on highly skewed datasets so that their prediction quality is comparable across all classes, regardless of the number of instances representing each class. Further, this project emphasizes that models trained in imbalanced learning environments should be evaluated using appropriate metrics. This work also finds that the Logistic Regression model is as capable of classifying documents as Naïve Bayes, which is well known for this task.
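The keyword extraction step named above can be illustrated with a minimal sketch of the RAKE scoring idea: split text into candidate phrases at stopwords and punctuation, score each word by degree divided by frequency, and score each phrase as the sum of its word scores. This is not the study's implementation. The tiny `STOPWORDS` set, the function names, and the sample bug description are all assumptions made for illustration.

```python
import re
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical, deliberately tiny stopword list; real use needs a fuller one.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "to", "and", "is",
             "are", "when", "for", "with", "after", "it", "this"}


def candidate_phrases(text: str) -> List[List[str]]:
    """Split text into runs of non-stopwords, breaking at punctuation."""
    phrases: List[List[str]] = []
    for fragment in re.split(r"[.,;:!?()\n]+", text.lower()):
        current: List[str] = []
        for word in re.findall(r"[a-z0-9]+", fragment):
            if word in STOPWORDS:
                if current:
                    phrases.append(current)
                    current = []
            else:
                current.append(word)
        if current:
            phrases.append(current)
    return phrases


def rake_scores(text: str) -> List[Tuple[str, float]]:
    """Return candidate phrases sorted by RAKE-style score, highest first."""
    phrases = candidate_phrases(text)
    freq: Dict[str, int] = defaultdict(int)
    degree: Dict[str, int] = defaultdict(int)
    for phrase in phrases:
        for word in phrase:
            freq[word] += 1
            degree[word] += len(phrase)  # co-occurrences, including the word itself
    word_score = {w: degree[w] / freq[w] for w in freq}
    scores = {" ".join(p): sum(word_score[w] for w in p) for p in phrases}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


if __name__ == "__main__":
    report = ("The application crashes when the user opens the settings "
              "dialog. Crash occurs after login.")
    for phrase, score in rake_scores(report):
        print(f"{score:.1f}  {phrase}")
```

The resulting phrase scores could then be turned into numerical attributes, for instance as one column per keyword, before being paired with severity labels; the exact encoding used in the study is not stated in the abstract.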

Articles published on Keyword Extraction Algorithm

A Graph-Based Keyword Extraction Method for Academic Literature Knowledge Graph Construction

A reversible natural language watermarking for sensitive information protection

An Accuracy Study of Personalized Recommendation System for E-commerce Based on Big Data Analysis

Development system for coordination of activities of experts in the formation of machineschetable standards in the field of military and space activities based on ontological engineering: a case study

Natural language processing (NLP) aided qualitative method in health research

Keyword Extraction-based Library Intelligence Services: Challenges, Adaptations and Reinvention

Automatic Keyword Extraction Algorithm for Chinese Text based on Word Clustering

Finding Experts in Community Question Answering System Using Trie String Matching Algorithm with Domain Knowledge

Retracted: Research on Keyword Extraction Algorithm in English Text Based on Cluster Analysis.

Retracted: TextRank Keyword Extraction Algorithm Using Word Vector Clustering Based on Rough Data-Deduction.

Research on keyword extraction based on abstract extraction

A New Text-Mining–Bayesian Network Approach for Identifying Chemical Safety Risk Factors

Research on Keyword Extraction Algorithm in English Text Based on Cluster Analysis.

Visual analytics and information extraction of geological content for text-based mineral exploration reports

TextRank Keyword Extraction Algorithm Using Word Vector Clustering Based on Rough Data-Deduction.

Multifeature Fusion Keyword Extraction Algorithm Based on TextRank

News keyword extraction algorithm based on semantic clustering and word graph model

Keyword Extraction Algorithm for Classifying Smoking Status from Unstructured Bilingual Electronic Health Records Based on Natural Language Processing

Bug Severity Prediction using Keywords in Imbalanced Learning Environment

Keyword extraction method for machine reading comprehension based on natural language processing
