Continuous Bag Of Words Research Articles

The rapid growth of advanced software computing and low-cost hardware has broadened the horizon for the Internet of Medical Things (IoMT) and for interoperable e-Healthcare systems serving varied purposes, including electronic healthcare records (EHRs) and telemedicine. However, because these systems are heterogeneous and dynamic, securing their databases remains a persistent challenge. Numerous intrusion attacks, including bot attacks and malware, have limited the use of major classical databases in e-Healthcare. Despite the robustness of NoSQL relative to structured query language (SQL) databases, the dynamic nature of data in a heterogeneous environment makes NoSQL vulnerable to intrusion attacks, especially in interoperable e-Healthcare systems. Considering these challenges, this work proposed a first-of-its-kind semantic feature-driven NoSQL intrusion attack (NoSQL-IA) detection model for interoperable e-Healthcare systems. It assessed the efficacy of different semantic feature-extraction methods, such as Word2Vec, Continuous Bag of Words (CBOW), N-Skip Gram (SKG), Count Vectorizer, TF-IDF, and GloVe, for NoSQL-IA prediction. Subsequently, to reduce computational cost, different feature selection methods were employed, including the Wilcoxon Rank Sum Test (WRST), significant predictor test, principal component analysis, Select K-Best, and variance threshold feature selection (VTFS). To alleviate the data imbalance problem, different resampling methods, including upsampling, downsampling, and the synthetic minority oversampling technique (SMOTE), were applied over the selected features. Min–Max normalization was then performed over the input feature vectors to reduce the possibility of overfitting.
For NoSQL-IA prediction, different machine learning methods, including Multinomial Naïve Bayes, decision tree, logistic regression, support vector machine, k-NN, AdaBoost, Extra Tree Classifier, random forest ensemble, and XG-Boost, were applied to classify each input query as either a regular query or a NoSQL-IA attack query. The in-depth performance assessment revealed that Word2Vec SKG features, combined with VTFS feature selection and SMOTE resampling and processed with the bootstrapped random forest classifier, provide the best performance in terms of accuracy (98.86%), F-Measure (0.974), and area under the curve (AUC) (0.981), making the model suitable for interoperable e-Healthcare database security.
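
The preprocessing order described in this abstract (feature selection, then resampling, then Min–Max normalization) can be illustrated with a small standard-library sketch. This is not the authors' implementation: the function names, the toy feature matrix, and the simplified SMOTE-style interpolation are all assumptions made for illustration, and a real pipeline would more likely rely on an established library.

```python
import random
from statistics import pvariance


def variance_threshold(rows, threshold=0.0):
    """Return indices of columns whose population variance exceeds threshold."""
    columns = list(zip(*rows))
    return [i for i, col in enumerate(columns) if pvariance(col) > threshold]


def select_columns(rows, keep):
    """Keep only the listed column indices in every row."""
    return [[row[i] for i in keep] for row in rows]


def smote_like(minority, n_new, k=2, seed=0):
    """Simplified SMOTE-style oversampling (illustrative assumption):
    interpolate between a minority sample and one of its k nearest
    minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


def min_max_scale(rows):
    """Rescale every column to [0, 1]; constant columns map to 0.0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    spans = [max(c) - min(c) for c in columns]
    return [
        [(v - lo) / span if span else 0.0 for v, lo, span in zip(row, lows, spans)]
        for row in rows
    ]


# Hypothetical toy features: 3 regular queries (0) and 2 attack queries (1).
X = [
    [0.2, 1.0, 5.0],
    [0.4, 1.0, 7.0],
    [0.1, 1.0, 6.0],
    [0.9, 1.0, 1.0],
    [0.8, 1.0, 2.0],
]
y = [0, 0, 0, 1, 1]

keep = variance_threshold(X)            # column 1 is constant, so it is dropped
X_sel = select_columns(X, keep)
minority = [row for row, label in zip(X_sel, y) if label == 1]
X_bal = X_sel + smote_like(minority, n_new=1)   # balance the classes 3 vs 3
y_bal = y + [1]
X_scaled = min_max_scale(X_bal)
```

The scaled matrix `X_scaled` and labels `y_bal` would then be passed to whichever classifier is under evaluation.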


Structured query language (SQL) databases have emerged as some of the most widely used, serving an array of Internet-of-Things (IoT)-enabled services, including web transactions, grid networks, industrial activity logs and proactive decision systems, smart homes, financial transactions, and business communication. With the rapid increase in SQL-driven IoT applications, the threat of SQL-injection attacks (SQLIAs) at the middleware layer has increased significantly. To address this, machine learning-based SQLIA-prediction systems have been proposed; however, the majority of existing methods are limited in intrusion detection accuracy because of their complete reliance on structural features and inferior learning models. Moreover, intruders now mimic normal queries, which confuses most classical learning-based methods. To alleviate such problems, this article emphasizes exploiting semantic features along with a state-of-the-art, highly robust computing environment. We proposed a robust semantic query-featured ensemble learning model for SQLIA prediction. Unlike classical query template-matching or term-assessment-based methods, the proposed SQLIA-prediction model exploits latent semantic features from large SQL queries to train an ensemble learning model that classifies each query as either a normal query or an SQLIA query. Functionally, it preprocesses a large set of SQL queries using a count vectorizer and stop-word removal. Subsequently, it applies the Word2Vec feature extraction method to each query using the continuous bag of words (CBOW) and N-skip gram (SKG) algorithms, obtaining CBOW and SKG semantic features from each SQL query. The extracted features were then resampled to alleviate class imbalance and skewness.
To reduce redundant computation, two feature selection algorithms, the Mann-Whitney significance predictor test and principal component analysis, were applied over the resampled features. Moreover, to mitigate over-fitting and convergence problems, Min-Max normalization was performed over the selected features, which were then used to train a state-of-the-art robust heterogeneous ensemble learning model. Unlike standalone classifier-based SQLIA detection, the proposed learning model employed a set of nine base classifiers designed to serve maximum voting ensemble-based prediction. The proposed ensemble-learning method classified each SQL query as either a normal query or an SQLIA query. Simulation results affirmed the superiority of the proposed SQLIA prediction model in terms of accuracy (98%), F-Score (0.989), and AUC (0.999), signifying its efficacy for real-world SQL-driven IoT ecosystems.
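
To make the difference between the two Word2Vec variants and the voting step concrete, the sketch below builds CBOW and skip-gram training pairs from one tokenized query and combines base-classifier outputs by majority vote. It is only an illustration under stated assumptions: the whitespace tokenizer, the window size, the example query, and all function names are hypothetical, and the actual Word2Vec model would train a neural network on such pairs rather than just enumerate them.

```python
from collections import Counter


def cbow_pairs(tokens, window=2):
    """CBOW view: the surrounding context words are used to predict the centre word."""
    pairs = []
    for i, target in enumerate(tokens):
        context = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
        if context:
            pairs.append((context, target))
    return pairs


def skipgram_pairs(tokens, window=2):
    """Skip-gram view: the centre word is used to predict each context word."""
    return [
        (target, ctx)
        for context, target in cbow_pairs(tokens, window)
        for ctx in context
    ]


def majority_vote(predictions):
    """Return the label chosen by most base classifiers.
    With an odd number of voters and two labels (e.g. nine), no tie can occur."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical injection-style query, tokenized on whitespace for simplicity.
query = "select * from users where id = 1 or 1 = 1".split()

cbow = cbow_pairs(query, window=1)
skip = skipgram_pairs(query, window=1)
print(cbow[2])   # context and target for the third token
print(skip[:3])  # first few (centre, context) pairs

# Hypothetical outputs of nine base classifiers for one query.
votes = ["attack", "normal", "attack", "attack", "normal",
         "attack", "attack", "normal", "attack"]
label = majority_vote(votes)
```

CBOW produces one training example per position, whereas skip-gram produces one per (centre, context) pair, which is why skip-gram yields more examples from the same query.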

Articles published on Continuous Bag Of Words

Enhanced Arabic Sentiment Analysis Using a Novel Stacking Ensemble of Hybrid and Deep Learning Models

Contextual Word2Vec Model for Understanding Chinese Out of Vocabularies on Online Social Media

A Dynamic Strategy for Classifying Sentiment From Bengali Text by Utilizing Word2vector Model

SFN: A Novel Scalable Feature Network for Vulnerability Representation of Open-Source Codes.

Refining Word Embeddings with Sentiment Information for Sentiment Analysis

Military Chain: Construction of Domain Knowledge Graph of Kill Chain Based on Natural Language Model

Augmented language model with deep learning adaptation on sentiment analysis for E-learning recommendation

Dynamic Data Infrastructure Security for Interoperable e-Healthcare Systems: A Semantic Feature-Driven NoSQL Intrusion Attack Detection Model.

Semantic Query-Featured Ensemble Learning Model for SQL-Injection Attack Detection in IoT-Ecosystems

A Hybrid Approach for Network Rumor Detection Based on Attention Mechanism and Bidirectional GRU Model in Big Data Environment

Sentiment analysis using global vector and long short-term memory

Deep Sentiment Analysis Using CNN-LSTM Architecture of English and Roman Urdu Text Shared in Social Media

Seeds: Sampling-Enhanced Embeddings.

Ensemble Classifiers for Arabic Sentiment Analysis of Social Network (Twitter Data) towards COVID-19-Related Conspiracy Theories

Abstractive Arabic Text Summarization Based on Deep Learning.

Bengali paper classification using ensemble machine learning algorithms

A Simple and Effective Usage of Word Clusters for CBOW Model

A hybrid E-learning recommendation integrating adaptive profiling and sentiment analysis

Reusable Component Retrieval from a Large Repository Using Word2Vec with Continuous Bag of Words

Applying Sentiment Product Reviews and Visualization for BI Systems in Vietnamese E-Commerce Website: Focusing on Vietnamese Context
