CNN-GRU for Drowsiness Detection from Electrocardiogram Signal

Abstract

Driver drowsiness is a road-safety problem. To mitigate it, driver-monitoring systems have been implemented in current car models, and electrocardiography (ECG) is one of the most commonly used monitoring techniques. ECG data have previously been modeled using deep neural networks, including a Bidirectional Gated Recurrent Unit (Bi-GRU). However, the accuracy for classifying Wake-Sleep is under 80% and Wake-NREM-REM reaches less than 68%. To address this issue, ECG data from the MESA and SHHS datasets are modeled using a combination of a Convolutional Neural Network (CNN) and a Bi-GRU, referred to as CNN-GRU. The model incorporates Batch Normalization and RMSProp to improve accuracy in classifying drivers' conditions. It runs in two computing environments: cloud computing (Google Colaboratory, also known as Colab) and edge computing (a laptop with an AMD Ryzen 5 4600H processor). The evaluation focuses on the case where no internet connectivity is available to process the classification. The model achieved accuracies of 82.88% and 81.78% for Wake-Sleep classification in cloud and edge computing, respectively, and 71.01% (Colab) and 68.85% (edge computing) for Wake-NREM-REM classification. These results indicate that CNN-GRU outperforms the previous Bi-GRU model, which achieved only 80.42% (Colab) and 76.2% (edge computing) for Wake-Sleep, and 68.85% (Colab) and 66.43% for Wake-NREM-REM.
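
The comparison reported in the abstract can be tabulated as gains in percentage points. The short Python sketch below only restates the quoted figures. It reads the unlabeled 66.43% as the edge-computing result for Bi-GRU, which is an assumption, and the dictionary layout and function name are illustrative rather than taken from the paper.

```python
# Accuracies (percent) as quoted in the abstract.
# Assumption: the unlabeled 66.43% is the Bi-GRU edge-computing result.
cnn_gru = {
    ("Wake-Sleep", "cloud"): 82.88,
    ("Wake-Sleep", "edge"): 81.78,
    ("Wake-NREM-REM", "cloud"): 71.01,
    ("Wake-NREM-REM", "edge"): 68.85,
}
bi_gru = {
    ("Wake-Sleep", "cloud"): 80.42,
    ("Wake-Sleep", "edge"): 76.2,
    ("Wake-NREM-REM", "cloud"): 68.85,
    ("Wake-NREM-REM", "edge"): 66.43,
}


def accuracy_gains(new, old):
    """Return the gain in percentage points for every (task, platform) key."""
    return {key: round(new[key] - old[key], 2) for key in new}


gains = accuracy_gains(cnn_gru, bi_gru)
for (task, platform), gain in gains.items():
    print(f"{task:14s} {platform:5s} +{gain:.2f} points")
```

Under that reading, the largest gain is for Wake-Sleep classification on the edge device, and the other three gains are each between two and three points.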

Similar Papers
  • Research Article
  • Cite Count: 2
  • 10.1080/15567036.2024.2318485
A short-term wind speed prediction method utilizing rolling decomposition and time-series extension to avoid information leakage
  • Feb 28, 2024
  • Energy Sources, Part A: Recovery, Utilization, and Environmental Effects
  • Pinhan Zhou + 4 more

The accuracy of wind speed prediction is crucial for the efficient operation and scheduling of power grids. In recent years, many wind speed prediction methods have been proposed, but the results have always been unsatisfactory, and the model accuracy in experimental testing has always been overestimated. This study focuses on the problem of information leakage caused by decomposing the test set together with the training set in traditional wind speed prediction methods. Using the original model without decomposition as the standard and the mean absolute error (P_MAE) and mean squared error (P_MSE) as evaluation metrics, the degree to which information leakage inflates the model accuracy was quantified. The results show that when the test set is decomposed together with the training set, the accuracy of the model is significantly overestimated. Specifically, the overestimation of P_MAE ranges from 40% to 55%, and that of P_MSE from 65% to 85%. In addition, a singular spectrum analysis (SSA), rolling decomposition (RD), convolutional neural network (CNN), bidirectional gated recurrent unit (BiGRU), and attention mechanism (AM) model based on the RD method, SSA-RD-CNN-BiGRU-AM, was proposed. First, SSA was used to denoise the wind speed sequence, and then RD was performed on the original sequence to provide input vectors for the neural network model. Then, the CNN-BiGRU-AM hybrid neural network module predicted the wind speed sequence. Finally, to suppress the impact of boundary effects on the model accuracy, a time-series extension strategy based on neural networks was incorporated into the model. An example analysis indicates that the SSA-RD-CNN-BiGRU-AM model can avoid information leakage compared with other traditional models.
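
The leakage problem described above can be illustrated with a much simpler transform than SSA. In the Python sketch below, a moving average stands in for the decomposition step (this substitution is an assumption made for brevity, not the paper's method). The trailing version uses only past and present samples, as a rolling scheme would, while the centered version also reads future samples. All names are hypothetical.

```python
def trailing_mean(series, window):
    """Smooth each point using only itself and earlier points (no future data)."""
    out = []
    for t in range(len(series)):
        start = max(0, t - window + 1)
        chunk = series[start:t + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def centered_mean(series, half_width):
    """Smooth each point using neighbours on both sides, including future points."""
    out = []
    for t in range(len(series)):
        chunk = series[max(0, t - half_width):t + half_width + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Two series that differ only in the last ("test") sample.
a = [1, 2, 3, 4, 5]
b = [1, 2, 3, 4, 50]

# The trailing transform of the first four points ignores the last sample.
print(trailing_mean(a, 3)[:4] == trailing_mean(b, 3)[:4])
# The centered transform of the fourth point depends on the last sample.
print(centered_mean(a, 1)[3], centered_mean(b, 1)[3])
```

Any feature computed the second way carries information from the test portion into earlier time steps, which is the kind of leakage that inflates reported accuracy.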

  • Research Article
  • 10.26634/jit.11.4.19119
Text emotion recognition using fast text word embedding in bi-directional gated recurrent unit
  • Jan 1, 2022
  • i-manager's Journal on Information Technology
  • Devi C Akalya + 5 more

Emotions are states of readiness in the mind that result from evaluations of one's own thinking or events. Although almost all of the important events in our lives are marked by emotions, the nature, causes, and effects of emotions are some of the least understood parts of the human experience. Emotion recognition is playing a promising role in the domains of human-computer interaction and artificial intelligence. A human's emotions can be detected using a variety of methods, including facial gestures, blood pressure, body movements, heart rate, and textual data. From an application standpoint, the ability to identify human emotions in text is becoming more and more crucial in computational linguistics. In this work, we present a classification methodology based on deep neural networks. The Bi-directional Gated Recurrent Unit (Bi-GRU) employed here demonstrates its effectiveness on the Multimodal Emotion Lines Dataset (MELD) when compared to Convolutional Neural Networks (CNN) and Long Short-Term Memory (LSTM). For word encoding, a comparison of three pre-trained word embeddings namely Glove, Word2Vec, and fastText is made. The findings from the MELD corpus support the conclusion that fastText is the best word embedding for the proposed Bi-GRU model. The experiment utilized the "glove.6B.300d" vector space. It consists of two million word representations in 300 dimensions trained on Common Crawl with sub-word information (600 billion tokens). The accuracy scores of GloVe, Word2Vec, and fastText (300 dimensions each) are tabulated and studied in order to highlight the improved results with fastText on the MELD dataset tested. It is observed that the Bidirectional Gated Recurrent Unit (Bi-GRU) with fastText word embedding outperforms GloVe and Word2Vec with an accuracy of 79.7%.

  • Research Article
  • Cite Count: 71
  • 10.1145/3358192
Achieving Super-Linear Speedup across Multi-FPGA for Real-Time DNN Inference
  • Oct 8, 2019
  • ACM Transactions on Embedded Computing Systems
  • Weiwen Jiang + 6 more

Real-time Deep Neural Network (DNN) inference with low-latency requirement has become increasingly important for numerous applications in both cloud computing (e.g., Apple’s Siri) and edge computing (e.g., Google/Waymo’s driverless car). FPGA-based DNN accelerators have demonstrated both superior flexibility and performance; in addition, for real-time inference with low batch size, FPGA is expected to achieve further performance improvement. However, the performance gain from the single-FPGA design is obstructed by the limited on-chip resource. In this paper, we employ multiple FPGAs to cooperatively run DNNs with the objective of achieving super-linear speed-up against single-FPGA design. In implementing such systems, we found two barriers that hinder us from achieving the design goal: (1) the lack of a clear partition scheme for each DNN layer to fully exploit parallelism, and (2) the insufficient bandwidth between the off-chip memory and the accelerator due to the growing size of DNNs. To tackle these issues, we propose a general framework, “Super-LIP”, which can support different kinds of DNNs. In this paper, we take Convolutional Neural Network (CNN) as a vehicle to illustrate Super-LIP. We first formulate an accurate system-level model to support the exploration of best partition schemes. Then, we develop a novel design methodology to effectively alleviate the heavy loads on memory bandwidth by moving traffic from memory bus to inter-FPGA links. We implement Super-LIP based on ZCU102 FPGA boards. Results demonstrate that Super-LIP with 2 FPGAs can achieve 3.48× speedup, compared to the state-of-the-art single-FPGA design. What is more, as the number of FPGAs scales up, the system latency can be further reduced while maintaining high energy efficiency.
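
"Super-linear" here means that the speedup exceeds the number of devices. A small Python check of the quoted figure (3.48× on 2 FPGAs) follows. The function names are illustrative and not part of Super-LIP.

```python
def parallel_efficiency(speedup, num_devices):
    """Speedup per device; a value above 1.0 means super-linear scaling."""
    if num_devices < 1:
        raise ValueError("num_devices must be at least 1")
    return speedup / num_devices


def is_super_linear(speedup, num_devices):
    """True when the speedup is larger than the number of devices used."""
    return speedup > num_devices


eff = parallel_efficiency(3.48, 2)
print(f"efficiency per FPGA: {eff:.2f}")
print("super-linear:", is_super_linear(3.48, 2))
```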

  • Research Article
  • Cite Count: 49
  • 10.3390/su16031012
Fault Detection and Diagnosis of a Photovoltaic System Based on Deep Learning Using the Combination of a Convolutional Neural Network (CNN) and Bidirectional Gated Recurrent Unit (Bi-GRU)
  • Jan 24, 2024
  • Sustainability
  • Ahmed Faris Amiri + 4 more

The meticulous monitoring and diagnosis of faults in photovoltaic (PV) systems enhances their reliability and facilitates a smooth transition to sustainable energy. This paper introduces a novel application of deep learning for fault detection and diagnosis in PV systems, employing a three-step approach. Firstly, a robust PV model is developed and fine-tuned using a heuristic optimization approach. Secondly, a comprehensive database is constructed, incorporating PV model data alongside monitored module temperature and solar irradiance for both healthy and faulty operation conditions. Lastly, fault classification utilizes features extracted from a combination consisting of a Convolutional Neural Network (CNN) and Bidirectional Gated Recurrent Unit (Bi-GRU). The amalgamation of parallel and sequential processing enables the neural network to leverage the strengths of both convolutional and recurrent layers concurrently, facilitating effective fault detection and diagnosis. The results affirm the proposed technique’s efficacy in detecting and classifying various PV fault types, such as open circuits, short circuits, and partial shading. Furthermore, this work underscores the significance of dividing fault detection and diagnosis into two distinct steps rather than employing deep learning neural networks to determine fault types directly.

  • Research Article
  • Cite Count: 23
  • 10.1109/access.2020.3006569
Sentiment Analysis via Deep Multichannel Neural Networks With Variational Information Bottleneck
  • Jan 1, 2020
  • IEEE Access
  • Tong Gu + 2 more

With the rapid development of e-commerce, online consumption has become a mainstream form of consumption in recent years. Text sentiment analysis of the large number of customer reviews on e-commerce platforms can dramatically improve customers' consumption experience. Although sentiment analysis approaches based on deep neural networks can achieve higher accuracy without human-designed features compared with traditional sentiment analysis methods, the accuracy still cannot meet the demand, and training suffers from issues such as over-fitting and vanishing gradients. In this paper, a novel sentiment analysis model named MBGCV is designed to alleviate these problems and improve the accuracy. MBGCV employs a multichannel paradigm and integrates a Bidirectional Gated Recurrent Unit (BiGRU), a Convolutional Neural Network (CNN), and a Variational Information Bottleneck (VIB). The multichannel structure is exploited to extract multi-grained sentiment features. In each channel, BiGRU is utilized to extract context information, and then CNN is adopted to extract local features. Furthermore, the model combines the advantages of VIB and the Maxout activation function to overcome shortcomings of existing sentiment analysis models such as over-fitting and vanishing gradients. Using real review datasets, we carry out extensive experiments to demonstrate that our proposed model can achieve superior accuracy and improve the performance of text sentiment analysis.

  • Research Article
  • Cite Count: 4
  • 10.11591/eei.v13i2.6938
Empowering hate speech detection: leveraging linguistic richness and deep learning
  • Apr 1, 2024
  • Bulletin of Electrical Engineering and Informatics
  • I Gde Bagus Janardana Abasan + 1 more

Social media has become a vital part of modern personal life. Twitter is one of the social media platforms that emerged from the development of communication technology. Many social media platforms give users the freedom to express themselves. Some users misuse this freedom, and hate speech spreads as a result. A system that detects hate speech intelligently is therefore needed. This study uses hybrid deep learning (HDL) and solo deep learning (SDL) approaches with the convolutional neural network (CNN) and bidirectional gated recurrent unit (Bi-GRU) algorithms. Four models are built: CNN, Bi-GRU, CNN+Bi-GRU, and Bi-GRU+CNN. Term frequency-inverse document frequency (TF-IDF) is used for feature extraction, that is, to obtain the linguistic features to be analyzed and studied. FastText is used to perform feature expansion to minimize vocabulary mismatch. Four scenarios are run. CNN achieves an accuracy of 87.63%, Bi-GRU 87.46%, CNN+Bi-GRU 87.47%, and Bi-GRU+CNN 87.34%. The ability of this approach to understand the context is qualified. HDL outperforms SDL in terms of n-gram type: HDL can understand sentences broken down by hybrid n-gram types, namely Unigram-Bigram-Trigram, which is a complex n-gram hybrid.
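
The hybrid Unigram-Bigram-Trigram features mentioned above can be sketched in a few lines of Python. This is a minimal illustration only. Lowercasing and whitespace tokenization are assumptions, the function names are hypothetical, and the study's actual preprocessing and TF-IDF weighting are not reproduced here.

```python
def ngrams(tokens, n):
    """Return all contiguous n-token sequences, joined with spaces."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def hybrid_ngrams(text, sizes=(1, 2, 3)):
    """Concatenate the n-grams of every requested size into one feature list."""
    tokens = text.lower().split()  # assumption: simple whitespace tokenization
    features = []
    for n in sizes:
        features.extend(ngrams(tokens, n))
    return features


print(hybrid_ngrams("Hate speech detection"))
```

A term-weighting scheme such as TF-IDF would then be computed over these combined features rather than over single words alone.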

  • Research Article
  • 10.54392/irjmt25613
Deceptive News Content Detection using a Hybrid Transformer-based and Deep Learning Model with Explainability
  • Nov 28, 2025
  • International Research Journal of Multidisciplinary Technovation
  • Arati M Chabukswar + 3 more

The growth of social media platforms has facilitated knowledge dissemination. Misinformation spreads rapidly, and its ability to affect elections, sway public opinion, and instigate instability makes it a dangerous threat to civilization. For an informed and reliable information ecology to survive, the ability to identify deceptive information in a wide variety of languages is essential. Transformer-based pretrained language models (TB-PLMs), namely the BERT variants Distilled Bidirectional Encoder Representations from Transformers (DistilBERT), A Lite BERT (ALBERT), and Robustly Optimized BERT Pretraining Approach (RoBERTa), are combined with deep neural network structures such as the Bi-directional Gated Recurrent Unit (BiGRU) and Convolutional Neural Network (CNN) to identify deceptive news in English. The dataset used for the challenge combines the LIAR and Fake/Real news datasets, resulting in a Combined Corpus (CC) dataset about politics. TB-PLMs are optimized to understand the semantic linkages and contextual information found in the dataset. BiGRU and CNN layers are used to capture the dependencies between neighboring characters in the text. The experimental findings show that the RoBERTa+BiGRU model outperforms all the other models in identifying English deceptive news, with an accuracy of 99.04%, a rise of 0.06% over the base RoBERTa model. In addition, the DistilBERT+BiGRU and RoBERTa+BiGRU models perform well on features for both classes (class 0 and class 1) when Local Interpretable Model-agnostic Explanations (LIME) is used to clarify the target labels, which can facilitate valid data extraction and processing to counteract deceptive information.

  • Research Article
  • Cite Count: 10
  • 10.3390/computers9020036
Advanced Convolutional Neural Network-Based Hybrid Acoustic Models for Low-Resource Speech Recognition
  • May 2, 2020
  • Computers
  • Tessfu Geteye Fantaye + 2 more

Deep neural networks (DNNs) have shown a great achievement in acoustic modeling for speech recognition task. Of these networks, convolutional neural network (CNN) is an effective network for representing the local properties of the speech formants. However, CNN is not suitable for modeling the long-term context dependencies between speech signal frames. Recently, the recurrent neural networks (RNNs) have shown great abilities for modeling long-term context dependencies. However, the performance of RNNs is not good for low-resource speech recognition tasks, and is even worse than the conventional feed-forward neural networks. Moreover, these networks often overfit severely on the training corpus in the low-resource speech recognition tasks. This paper presents the results of our contributions to combine CNN and conventional RNN with gate, highway, and residual networks to reduce the above problems. The optimal neural network structures and training strategies for the proposed neural network models are explored. Experiments were conducted on the Amharic and Chaha datasets, as well as on the limited language packages (10-h) of the benchmark datasets released under the Intelligence Advanced Research Projects Activity (IARPA) Babel Program. The proposed neural network models achieve 0.1–42.79% relative performance improvements over their corresponding feed-forward DNN, CNN, bidirectional RNN (BRNN), or bidirectional gated recurrent unit (BGRU) baselines across six language collections. These approaches are promising candidates for developing better performance acoustic models for low-resource speech recognition tasks.

  • Conference Article
  • Cite Count: 15
  • 10.1145/3289602.3293988
XFER
  • Feb 20, 2019
  • Weiwen Jiang + 6 more

Real-time inference with low latency requirement has become increasingly important for numerous applications in both cloud computing and edge computing. The FPGA-based Deep Neural Network (DNN) accelerators have demonstrated the superior performance and energy efficiency over CPUs and GPUs; in addition, for real-time AI with low batch size, FPGA is expected to achieve further performance improvement over the general purpose computing platform. However, the performance gain of the single-FPGA design is hindered by the limited on-chip resource. In this paper, we leverage a cluster of FPGAs to fully exploit the parallelism in DNNs with the objective of obtaining super-linear performance. To achieve this goal, a novel design, XFER, is proposed to deploy DNNs to FPGA cluster by splitting the DNN layer to multiple FPGAs and moving traffics from memory bus to inter-FPGA links. The resultant system can achieve both workload balance and traffic balance. As a case study, we implement Convolutional Neural Networks (CNNs) on ZCU102 FPGA boards. Evaluation results demonstrate that XFER on two FPGAs can achieve 3.48x speedup compared with state-of-the-art FPGA designs, achieving super-linear speedup.

  • Conference Article
  • Cite Count: 1
  • 10.1109/icca51439.2020.9264431
Dimension-raising Processing Framework for One-dimensional Time Series and its Application in Affect Detection
  • Oct 9, 2020
  • Ziman Ye + 3 more

This paper explores the application of neural networks to affect detection from electrocardiograph (ECG) data. Affect recognition is one of the most challenging tasks. Because image- and sound-based affect detection is subject to large cultural and personal differences, whereas the physiological signal response to emotion is more universal and accurate, we detect emotions from physiological signals. This paper proposes a detection framework for single-modal physiological signals and detects affect from nonstationary ECG data. We use ECG data from the UCI dataset named Multimodal Dataset for Wearable Stress and Affect Detection (WESAD), which includes ECG signals from 15 subjects. To extract features from the ECG data, we propose a stacking operation that increases the dimension of the ECG data. With this operation, a convolutional neural network (CNN) can easily extract multiscale periodic features of the ECG data. Experimental results show that the stack-based VGG is capable of classifying four and five different kinds of affect with accuracies of 97.78% and 95.87%, respectively. The high-dimensional convolutional neural network provides better performance than one-dimensional convolutional neural network models. This approach can also be applied to other applications of single-modal physiological signals.
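
The abstract does not spell out the stacking operation. One plausible reading, shown below purely as an assumption, is to cut the 1D signal into fixed-length segments and stack them as rows, so that a 2D convolution can see both within-segment and across-segment structure. The function name and segment length are hypothetical.

```python
def stack_segments(signal, segment_length):
    """Cut a 1D sequence into consecutive equal-length rows.

    Any incomplete tail shorter than segment_length is dropped.
    """
    if segment_length <= 0:
        raise ValueError("segment_length must be positive")
    n_rows = len(signal) // segment_length
    return [
        list(signal[r * segment_length:(r + 1) * segment_length])
        for r in range(n_rows)
    ]


# Toy signal of 7 samples stacked into rows of 3 samples each.
image = stack_segments([0, 1, 2, 3, 4, 5, 6], 3)
print(image)
```

If the segment length is close to one heartbeat period, successive beats line up in columns, which may be what makes periodic features easier for a 2D network to pick up.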

  • Research Article
  • Cite Count: 70
  • 10.1007/s12652-019-01239-9
Evaluating fusion of RGB-D and inertial sensors for multimodal human action recognition
  • Feb 12, 2019
  • Journal of Ambient Intelligence and Humanized Computing
  • Javed Imran + 1 more

Fusion of multiple modalities from different sensors is an important area of research for multimodal human action recognition. In this paper, we conduct an in-depth study to investigate the effect of different parameters such as input preprocessing, data augmentation, network architectures, and model fusion, so as to come up with a practical guideline for multimodal action recognition using the deep learning paradigm. First, for RGB videos, we propose a novel image-based descriptor called stacked dense flow difference image (SDFDI), capable of capturing the spatio-temporal information present in a video sequence. A variety of deep 2D convolutional neural networks (CNN) are then trained to compare our SDFDI against state-of-the-art image-based representations. Second, for the skeleton stream, we propose a data augmentation technique based on 3D transformations so as to facilitate training a deep neural network on small datasets. We also propose a bidirectional gated recurrent unit (BiGRU) based recurrent neural network (RNN) to model skeleton data. Third, for inertial sensor data, we propose data augmentation based on jittering with white Gaussian noise along with a deep 1D-CNN network for action classification. The outputs of all three heterogeneous networks (1D-CNN, 2D-CNN and BiGRU) are combined by a variety of model fusion approaches based on score and feature fusion. Finally, in order to illustrate the efficacy of the proposed framework, we test our model on the publicly available UTD-MHAD dataset and achieve an overall accuracy of 97.91%, which is about 4% higher than using each modality individually. We hope that the discussions and conclusions from this work will provide deeper insight to researchers in related fields and provide avenues for further studies of different multi-sensor fusion architectures.
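
Jittering with white Gaussian noise, as mentioned for the inertial sensor data, can be sketched as follows. The noise level `sigma`, the seeding, and the function name are assumptions for illustration, since the abstract does not give the authors' settings.

```python
import random


def jitter(samples, sigma=0.05, seed=None):
    """Return a copy of samples with zero-mean Gaussian noise added to each value."""
    rng = random.Random(seed)  # local generator so the global state is untouched
    return [x + rng.gauss(0.0, sigma) for x in samples]


accel = [0.0, 0.1, 0.4, 0.9, 0.4, 0.1, 0.0]  # toy accelerometer trace
augmented = jitter(accel, sigma=0.05, seed=42)
print(len(augmented) == len(accel))
```

Each call with a different seed yields another noisy variant of the same recording, which is how this kind of augmentation enlarges a small training set.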

  • Book Chapter
  • 10.1201/9781003279044-5
Reinforcement of the Multi-Cloud Infrastructure with Edge Computing
  • Oct 12, 2022
  • L Steffina Morin + 1 more

Cloud computing and Edge computing are two new technologies that have the potential to improve people’s daily lives. Furthermore, the integration of cloud computing and edge computing has been enhancing the productivity of a vast range of applications in industries such as supply chains, health care, commercial, engineering, and manufacturing, among others. Security is currently a major concern in cloud-edge computing. Modern Edge Cloud security is based on a collection of naive security services such as distribution of keys, authentication, and access control, which are typically deployed at the cloud as well as edge level. In recent years, a number of privacy-preserving models and security techniques have been created by a wide range of academics for this goal. Despite this, they have failed to meet the most recent cloud-edge user’s security expectations. In this chapter, we have proposed the private cloud-edge infrastructure using Openstack which is vulnerable to cyber attacks. Complex security needs at the cloud’s edge entail the provision of a large number of basic services.

  • Research Article
  • Cite Count: 2
  • 10.1155/2022/1534596
Automatic Recognition and Repair System of Mural Image Cracks Based on Cloud Edge Computing and Digitization
  • Oct 10, 2022
  • Mobile Information Systems
  • Yongli Gao + 1 more

Mural painting is the art on the wall, it is the painting that people draw on the wall, it is one of the earliest forms of painting in human history, and it is also an accessory part of the building. The decorative and beautifying functions of murals make them an important aspect of environmental art. Cloud edge computing is a combination of cloud computing and edge computing, that fully absorbs the advantages of both cloud computing and edge computing and maximizes their advantages. In this study, based on cloud edge computing and digital technology, the automatic identification and repair system of fresco image cracks is studied. Image segmentation techniques have been proposed in this study, using 60 murals in three regions as experimental objects. Through experimental analysis, it is found that the traditional pine poise treatment method takes the shortest repair time. However, for a specific image, it is difficult to guarantee the quality of its restoration. The mural image in area A was repaired with the conventional pine pitch repair method, which took 113.01 seconds, and the subjective evaluation was 69 points. Using the repair method described in this study to repair, it takes 127.38 seconds, and its subjective evaluation score is the highest, which is 87 points. The experimental results have shown that the cloud edge computing method and digital technology have had a certain positive effect on the identification and repair system of fresco image cracks.

  • Book Chapter
  • Cite Count: 3
  • 10.1007/978-981-19-5868-7_37
An Enhanced Deep Learning Approach for Smartphone-Based Human Activity Recognition in IoHT
  • Jan 1, 2023
  • Vaibhav Soni + 5 more

Human activity recognition (HAR) uses sensor-based technology to predict human activity using sensor-generated time-series data. According to recent studies, researchers have been drawn to the area of HAR as the use of mobile devices with various sensors has increased in several research areas in health care that includes the identification of gait abnormality in brain or neurological disorder subjects, designing techniques for clinical gait analyses of a disabled and elderly person, etc. HAR is especially essential on the Internet of healthcare things (IoHT) because of the increasing growth of the Internet of Things (IoT) technology incorporated in numerous smart products and wearable technology (such as smartwatches and smartphones) that have a significant impact on the life of human. A deep neural network (DNN) comprising a bidirectional gated recurrent unit (BiGRU) and two convolutional layers is proposed in this research. The model used could automatically extract and identify activity features using a few of the architecture parameters. The raw data obtained using smartphone sensors is sent into a two layer BiGRU followed by two CNN layers in the proposed architecture. The model’s performance is assessed using two publicly available datasets (UCI-HAR and WISDM). The observations indicate that the reliability and activity identification capability of the proposed model is better than earlier findings.

Keywords: Human activity recognition, IoHT, Deep learning, Bidirectional gated recurrent unit (BiGRU), Convolutional neural network (CNN)

  • Research Article
  • Cite Count: 27
  • 10.1016/j.neucom.2020.04.084
HieNN-DWE: A hierarchical neural network with dynamic word embeddings for document level sentiment classification
  • Apr 28, 2020
  • Neurocomputing
  • Fagui Liu + 2 more
