Similar Papers
  • Research Article
  • 10.1093/europace/euac053.560
Studying the relation between post-infarction ventricular arrhythmia and left ventricular myocardium thinning on computed tomography images using explainable deep learning
  • May 19, 2022
  • EP Europace
  • B Ly + 8 more

  • Supplementary Content
  • Cited by 2
  • 10.1155/2022/4670963
Analysis of Music Retrieval Based on Emotional Tags Environment.
  • Jan 1, 2022
  • Journal of environmental and public health
  • Nuan Bao

In general, tags are used to interpret the content of music, while the music itself expresses emotion. The emotional information conveyed by the same piece of music is described by a large number of emotion tags in varied ways. This paper proposes an algorithm for music retrieval based on emotion tags. By modelling user emotion tags and music, a bipartite graph with tags and music as nodes is first created. The semantic similarity between tags and between music items is then calculated using the T_SimRank algorithm, and the popularity of the music is calculated using the T_PageRank algorithm. Finally, the two scores are combined through learning-to-rank to produce the final ranking of the music. Experiments demonstrate that the proposed method satisfies user retrieval needs better than conventional cosine-similarity and tag co-occurrence-based similarity methods, and that fusing multiple methods outperforms any single method.
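
As a rough illustration of the ranking pipeline sketched in this abstract, the snippet below builds a tiny tag-music bipartite graph and fuses off-the-shelf SimRank similarity with PageRank popularity. It uses networkx's stock algorithms rather than the paper's T_SimRank/T_PageRank variants, and the toy graph, query tag, and fusion weight are invented for illustration.

```python
# Toy sketch of bipartite tag-music ranking with networkx's stock SimRank and
# PageRank (the paper's T_SimRank / T_PageRank variants are not reproduced here);
# the graph, the query tag and the fusion weight alpha are placeholders.
import networkx as nx

G = nx.Graph()
tags, tracks = ["calm", "happy", "sad"], ["song_a", "song_b", "song_c"]
G.add_nodes_from(tags, bipartite=0)
G.add_nodes_from(tracks, bipartite=1)
G.add_edges_from([("calm", "song_a"), ("calm", "song_b"),
                  ("happy", "song_b"), ("sad", "song_c")])

sim = nx.simrank_similarity(G, importance_factor=0.8)  # tag-to-tag / track-to-track similarity
pop = nx.pagerank(G)                                    # node popularity

query, alpha = "calm", 0.7
scores = {}
for t in tracks:
    tag_sim = max(sim[query][g] for g in G.neighbors(t))  # closest tag on this track to the query tag
    scores[t] = alpha * tag_sim + (1 - alpha) * pop[t]    # naive rank fusion

print(sorted(scores.items(), key=lambda kv: kv[1], reverse=True))
```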

  • Conference Article
  • Cited by 38
  • 10.1109/aciiasia.2018.8470388
Emotional Human Machine Conversation Generation Based on SeqGAN
  • May 1, 2018
  • Xiao Sun + 3 more

In recent years, artificial intelligence has made significant progress in the field of human-machine conversation. However, generating high-quality, emotional, human-like responses remains difficult. The key factor in man-machine dialogue is whether the chatbot can give a good response at both the content and the emotional level: it must understand the user's emotions, take them into account, and then produce a satisfactory reply. In this paper, we add emotion tags to the posts and responses in the dataset; each tag represents the emotion expressed by that sentence. The purpose of these emotion tags is to let the chatbot grasp the emotion of the input sequence more directly, so that it gains awareness of the emotional dimension. We apply the GAN framework to our conversation model. The generator uses an encoder-decoder (seq2seq) structure to generate a response to a sentence. The discriminator distinguishes human-generated dialogues from machine-generated ones, and its outputs are used as rewards for the generative model, pushing the system to generate dialogues that closely resemble human dialogues. We cast the task as a reinforcement learning (RL) problem and use a policy-gradient method to reward more human-like conversational sequences; in addition, we add an emotion tag that represents the desired response emotion and include it in the reward, so that the emotion of the generated responses is pushed closer to the emotion we specify. Our experiments show that, by introducing emotional information, the model can generate responses that are appropriate not only in content but also in emotion, which can be used to guide and adjust the user's emotions. Compared with our previous work, we obtain better performance on the same dataset and fewer "safe" responses than before, although some still occur.
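
For readers unfamiliar with the policy-gradient training the abstract describes, the sketch below shows a generic REINFORCE-style objective in which a reward (here a discriminator score plus an emotion-match bonus) weights the generator's sequence log-likelihood. It is a minimal stand-in, not the authors' SeqGAN implementation; the beta weight and the dummy tensors are assumptions.

```python
# Minimal sketch of the policy-gradient idea behind SeqGAN-style dialogue
# training: the generator's log-likelihood of a sampled response is weighted
# by a reward combining the discriminator's "human" score with an emotion-match
# bonus. Shapes, the beta weight and the dummy inputs are illustrative only.
import torch

def reinforce_loss(log_probs, disc_scores, emotion_match, beta=0.5):
    """log_probs:     (batch, seq_len) log-probabilities of the sampled tokens
       disc_scores:   (batch,) discriminator probability that the response is human
       emotion_match: (batch,) 1.0 if the response carries the requested emotion tag
       beta:          assumed weight of the emotion term."""
    reward = disc_scores + beta * emotion_match      # per-response reward
    seq_log_prob = log_probs.sum(dim=1)              # log p(response | post)
    return -(reward.detach() * seq_log_prob).mean()  # REINFORCE: maximise reward-weighted likelihood

# Dummy call with random tensors standing in for generator/discriminator outputs.
loss = reinforce_loss(torch.log(torch.rand(4, 10)),
                      torch.rand(4),
                      torch.randint(0, 2, (4,)).float())
print(float(loss))
```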

  • Book Chapter
  • Cited by 2
  • 10.1007/978-1-4471-4555-4_11
Toward Emotional Annotation of Multimedia Contents
  • Oct 13, 2012
  • Ashkan Yazdani + 2 more

By annotating multimedia contents, users of a web resource can associate a word or a phrase (tag) with that resource so that other users can retrieve it through search. Nowadays, tags play an important role in the search and retrieval process in multimedia content sharing social networks. Explicit tagging refers to assigning tags directly, for example by typing; implicit tagging, in contrast, refers to assigning tags by observing users' behaviour while they are exposed to multimedia content. Among the various kinds of information that can be obtained for the purpose of implicit tagging, emotional information about a given content is of particular interest. In this chapter, we discuss various means of emotion recognition and emotional characterization that can be used as tools for emotional tagging. A P300-based brain-computer interface system is proposed for emotional tagging of multimedia content. We show that this system can successfully perform emotional tagging and that naive users who did not participate in training the system can also use it efficiently. Furthermore, we present emotional annotation systems using multimedia content analysis and electroencephalogram signal processing and compare them. Finally, a road map for developing a practical multimodal system for implicit emotional annotation of multimedia contents is sketched out.

Keywords: Video Clip, Emotion Recognition, Multimedia Content, Galvanic Skin Response, Emotional Category

  • Research Article
  • Cited by 3
  • 10.1049/ccs2.12038
AHRNN: Attention‐Based Hybrid Robust Neural Network for emotion recognition
  • Feb 22, 2022
  • Cognitive Computation and Systems
  • Ke Xu + 5 more

Existing methods struggle to capture the semantic emotion of a sentence when a cross-language corpus is lacking, which makes effective cross-language sentiment analysis difficult. To address this, we propose a neural network architecture called the Attention-Based Hybrid Robust Neural Network. The architecture combines pre-trained word embeddings with fine-tuning to obtain prior semantic information, two sub-networks and an attention mechanism to capture the global semantic emotional information in the text, and a fully connected layer with a softmax function to perform the final emotion classification. The convolutional neural network sub-network captures the local semantic emotional information of the text, the BiLSTM sub-network captures the contextual semantic emotional information, and the attention mechanism dynamically integrates this information to extract the key emotional content. We conduct experiments on a Chinese dataset (from the International Conference on Natural Language Processing and Chinese Computing) and an English dataset (SST), divided into three subtasks to evaluate the method. In the single-language emotion recognition task, it improves the accuracy of single-sentence positive/negative classification from 79% to 86%; recognition of fine-grained emotion tags improves by 9.6%; and accuracy on the cross-language emotion recognition task improves by 1.5%. Even with faulty data, the model's performance does not degrade significantly as long as the error rate is below 20%. These experimental results demonstrate the effectiveness of our method.
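
The general CNN + BiLSTM + attention pattern described above can be sketched in a few lines of PyTorch. This is not the authors' AHRNN code; the vocabulary size, hidden sizes, and pooling choices are illustrative assumptions.

```python
# Minimal sketch of a CNN + BiLSTM + attention sentence classifier, the general
# pattern described in the abstract (not the authors' AHRNN implementation);
# all sizes are illustrative.
import torch
import torch.nn as nn

class HybridEmotionNet(nn.Module):
    def __init__(self, vocab_size=10000, emb_dim=128, hidden=64, n_classes=2):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.conv = nn.Conv1d(emb_dim, hidden, kernel_size=3, padding=1)              # local features
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)  # contextual features
        self.attn = nn.Linear(3 * hidden, 1)    # scores each time step
        self.fc = nn.Linear(3 * hidden, n_classes)

    def forward(self, token_ids):                                          # (batch, seq_len)
        x = self.emb(token_ids)                                            # (batch, seq_len, emb)
        local = torch.relu(self.conv(x.transpose(1, 2))).transpose(1, 2)   # (batch, seq_len, hidden)
        context, _ = self.bilstm(x)                                        # (batch, seq_len, 2*hidden)
        feats = torch.cat([local, context], dim=-1)                        # (batch, seq_len, 3*hidden)
        weights = torch.softmax(self.attn(feats), dim=1)                   # attention over time steps
        pooled = (weights * feats).sum(dim=1)                              # (batch, 3*hidden)
        return self.fc(pooled)                                             # class logits

logits = HybridEmotionNet()(torch.randint(0, 10000, (8, 20)))   # dummy batch of 8 sentences
```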

  • Research Article
  • Cited by 30
  • 10.1609/aaai.v34i02.5538
Draft and Edit: Automatic Storytelling Through Multi-Pass Hierarchical Conditional Variational Autoencoder
  • Apr 3, 2020
  • Proceedings of the AAAI Conference on Artificial Intelligence
  • Meng-Hsuan Yu + 6 more

Automatic storytelling has consistently been a challenging area in natural language processing. Although considerable achievements have been made, the gap between automatically generated stories and human-written stories remains significant, and the limitations of existing automatic storytelling methods, such as content consistency and wording diversity, are evident. In this paper, we propose a multi-pass hierarchical conditional variational autoencoder model to overcome these challenges and limitations. While the conditional variational autoencoder (CVAE) is employed to generate diversified content, the hierarchical structure and multi-pass editing scheme allow the model to produce more consistent stories. We conduct extensive experiments on the ROCStories dataset. The results verify the validity and effectiveness of the proposed model, which yields substantial improvements over existing state-of-the-art approaches.

  • Research Article
  • Cited by 86
  • 10.1002/lpor.202000287
Generative Deep Learning Model for Inverse Design of Integrated Nanophotonic Devices
  • Oct 20, 2020
  • Laser & Photonics Reviews
  • Yingheng Tang + 9 more

A novel conditional variational autoencoder (CVAE) model for designing nanopatterned integrated photonic components is proposed. In particular, it is shown that the prediction capability of the CVAE model can be significantly improved by adversarial censoring and active learning. Moreover, generation of nanopatterned power splitters with arbitrary splitting ratios and a 550 nm broadband optical response, from 1250 to 1800 nm, is demonstrated. The design space consists of nanopatterned power splitters with footprints of 2.25 × 2.25 μm² and 20 × 20 etch-hole positions, with each hole position assuming a radius from a range of radii. Power splitters designed with the methods presented herein demonstrate an overall transmission of about 90% across the operating bandwidth from 1250 to 1800 nm. To the best of the authors' knowledge, this is the first time a state-of-the-art CVAE deep neural network model has been successfully used to design a physical device.

  • Research Article
  • Cited by 33
  • 10.3390/s23073457
Application of Variational AutoEncoder (VAE) Model and Image Processing Approaches in Game Design
  • Mar 25, 2023
  • Sensors (Basel, Switzerland)
  • Hugo Wai Leung Mak + 2 more

In recent decades, the Variational AutoEncoder (VAE) model has shown good potential and capability in image generation and dimensionality reduction. The combination of VAE with various machine learning frameworks has also worked effectively in different everyday applications; however, its possible use and effectiveness in modern game design has seldom been explored or assessed, and the use of its feature extractor for data clustering has likewise received little discussion in the literature. This study first explores mathematical properties of the VAE model, in particular the theoretical framework of the encoding and decoding processes and the achievable lower bound and loss functions for different applications; it then applies the established VAE model to generate new game levels based on two well-known game settings, and validates the effectiveness of its data-clustering mechanism with the aid of the Modified National Institute of Standards and Technology (MNIST) database. Statistical metrics and assessments are used to evaluate the performance of the proposed VAE model in these case studies. Based on the statistical and graphical results, several potential deficiencies, such as difficulty in handling high-dimensional and vast datasets and insufficient clarity of outputs, are discussed, and measures for future enhancement, such as tokenization and the combination of VAE and GAN models, are outlined. This can ultimately help maximize the strengths and advantages of VAE for future game design tasks and related industrial missions.
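
For reference, the lower bound and loss functions the abstract mentions reduce, in the standard formulation, to a reconstruction term plus a KL regularizer. The sketch below shows that textbook negative-ELBO loss and the reparameterization trick in PyTorch; it is not tied to the paper's game-level generator.

```python
# Standard VAE loss (negative ELBO): reconstruction error plus a KL term that
# pulls the approximate posterior N(mu, sigma^2) toward the unit Gaussian prior.
# This is the textbook form the abstract alludes to, not the paper's exact code.
import torch
import torch.nn.functional as F

def vae_loss(x_recon, x, mu, logvar, beta=1.0):
    recon = F.binary_cross_entropy(x_recon, x, reduction="sum")    # -E[log p(x|z)] for pixel data in [0, 1]
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())   # KL(q(z|x) || N(0, I))
    return recon + beta * kl

def reparameterize(mu, logvar):
    # z = mu + sigma * eps, so gradients flow through the sampling step.
    return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
```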

  • Research Article
  • Cited by 9
  • 10.1088/2632-2153/ad2e18
Robust errant beam prognostics with conditional modeling for particle accelerators
  • Mar 1, 2024
  • Machine Learning: Science and Technology
  • Kishansingh Rajput + 7 more

Particle accelerators are complex and comprise thousands of components, with many pieces of equipment running at their peak power. Consequently, they can fault and abort operations for numerous reasons, lowering efficiency and science output. To avoid these faults, we apply anomaly detection techniques to predict unusual behavior and perform preemptive actions that improve total availability. Supervised machine learning (ML) techniques such as siamese neural network models can outperform the often-used unsupervised or semi-supervised approaches to anomaly detection by leveraging label information. One challenge specific to anomaly detection for particle accelerators is the variability of the data caused by accelerator configuration changes within a production run of several months: ML models fail to provide accurate predictions when the data change with the configuration. To address this challenge, we include the configuration settings in our models and training to improve the results. Beam configurations are used as a conditional input so that the model can learn cross-correlations between data from different conditions and retain its performance. We employ conditional siamese neural network (CSNN) models and conditional variational autoencoder (CVAE) models to predict errant beam pulses at the Spallation Neutron Source under different system configurations and compare their performance. We demonstrate that CSNNs outperform CVAEs in our application.
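
A minimal sketch of the "conditional" idea follows, assuming the configuration vector is simply concatenated with each pulse before a shared siamese encoder trained with a contrastive loss. The layer sizes, margin, and random data are placeholders, not the paper's CSNN.

```python
# Sketch of a conditional siamese encoder: the machine configuration is
# concatenated with each pulse waveform before a shared encoder, and pairs are
# trained with a contrastive loss. Shapes and layer sizes are illustrative only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionalEncoder(nn.Module):
    def __init__(self, signal_dim=256, config_dim=8, embed_dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(signal_dim + config_dim, 128),
                                 nn.ReLU(), nn.Linear(128, embed_dim))

    def forward(self, signal, config):
        return self.net(torch.cat([signal, config], dim=-1))

def contrastive_loss(z1, z2, same_label, margin=1.0):
    d = F.pairwise_distance(z1, z2)
    # similar pairs are pulled together, dissimilar pairs pushed past the margin
    return (same_label * d.pow(2) +
            (1 - same_label) * F.relu(margin - d).pow(2)).mean()

enc = ConditionalEncoder()
cfg = torch.rand(16, 8)                  # beam configuration (conditional input)
loss = contrastive_loss(enc(torch.rand(16, 256), cfg),
                        enc(torch.rand(16, 256), cfg),
                        same_label=torch.randint(0, 2, (16,)).float())
```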

  • Research Article
  • Cited by 3
  • 10.1007/s11042-017-5125-8
Identifying affective levels on music video via completing the missing modality
  • Aug 23, 2017
  • Multimedia Tools and Applications
  • Mo Chen + 2 more

Emotion tagging, which labels stimuli with human-understandable semantic information, is one theme of interest in affective computing. Previous work indicates that modality fusion can improve performance on this kind of task. However, acquiring subjects' responses is costly and time consuming, so the response modality required by fusion methods is absent for a large part of multimedia content. To address this problem, this paper proposes a novel emotion tagging framework that completes the missing response modality based on the concept of brain encoding. In the framework, an encoding model is built from the response modality recorded from subjects and the stimulus modality extracted from the stimulus content; the model is then applied to videos whose response modalities are absent to complete the missing responses. Modality fusion is finally performed on the stimulus and response modalities, followed by classification. To test the performance of the proposed framework, the DEAP dataset is adopted as a benchmark. In the experiments, three kinds of features are employed as stimulus modalities, the response and fused modalities are computed under the proposed framework, and affective-level identification is conducted as the emotion tagging task. The results demonstrate that the proposed framework outperforms the accuracy obtained by using only the stimulus modality, with improvements above 5% in both valence and arousal for all kinds of stimulus modalities. Moreover, the improvement requires no extra physiological data acquisition, saving both cost and time.
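
The modality-completion step can be illustrated with a toy encoding model: fit a regression from stimulus features to recorded physiological responses on annotated videos, then predict the response modality where it is missing. The Ridge regressor and all dimensions below are assumptions, not the paper's pipeline.

```python
# Toy sketch of the "complete the missing modality" idea: learn a mapping from
# stimulus features to physiological response features on videos where both are
# available, then predict responses for videos where they are missing.
# Dimensions and the Ridge regressor are placeholders, not the paper's method.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(0)
stim_train = rng.random((40, 64))    # stimulus features for annotated videos
resp_train = rng.random((40, 32))    # measured physiological responses
stim_new = rng.random((10, 64))      # videos with no recorded responses

encoder = Ridge(alpha=1.0).fit(stim_train, resp_train)   # encoding model
resp_predicted = encoder.predict(stim_new)                # completed response modality

# The predicted responses can then be fused with the stimulus features and fed
# to an ordinary valence/arousal classifier.
fused = np.hstack([stim_new, resp_predicted])
```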

  • Research Article
  • Cited by 1
  • 10.2478/amns.2023.2.01047
Emotional Expression and Information Communication in English Texts Based on Artificial Intelligence Technology
  • Nov 8, 2023
  • Applied Mathematics and Nonlinear Sciences
  • Jingbo Hao + 1 more

This paper first investigates emotional expression and information communication in English texts, classifies them according to the human emotion-value relationship, and summarizes their characteristics. Second, using artificial intelligence technology, it constructs an analysis model for English text emotion and information communication based on a BiLSTM neural network. To handle the characteristics of English text quickly and efficiently, the emotional information of the text is first encoded; on top of this encoding, the BiLSTM network extracts the emotional features of the text, and a loss function addresses the loss of emotional features. A crawler is then used to obtain a dataset from the Chinese-English module of Chinese university MOOCs, evaluation indexes are set according to the model's performance, and an experimental analysis of English text emotional expression and information conveyance is carried out. The results show that, compared with the original CNN, LSTM, and T-LSTM, the BiLSTM-based network performs better in the task of text emotion expression and information conveyance, with accuracy remaining above 0.925 and slightly better results on the English dataset than on the Chinese dataset. This study aims to enhance English teaching and communication between Chinese and foreign cultures.

  • Research Article
  • 10.1200/jco.2025.43.16_suppl.e13665
A transfer-learned hierarchical variational autoencoder model for computational design of anticancer peptides.
  • Jun 1, 2025
  • Journal of Clinical Oncology
  • Farzad Midjani + 7 more

Background: Cancer is a leading cause of mortality globally, necessitating the development of effective therapies. Anticancer peptides (ACPs) show promise given their selective targeting of cancer cells, low toxicity, and ability to overcome drug resistance. However, traditional discovery and optimization methods are time-consuming and expensive, underscoring the need for efficient computational strategies. Methods: We developed a deep hierarchical conditional variational autoencoder (CVAE) for de novo ACP design, using transfer learning by initializing the encoder from the pre-trained ESM-2 model. A comprehensive ACP dataset was collected from 9 databases and contained 3,209 confirmed ACP and 4,292 non-ACP sequences. The non-ACP set was refined with the CD-HIT program, which clusters and removes redundant sequences, to keep only sequences with < 40% similarity to ACPs. The CVAE encoder used local and global feature extractors to map peptides into a 256-dimensional latent space, and the decoder reconstructed sequences using gated recurrent unit (GRU) and Transformer layers with nucleus sampling. A multi-task classifier predicts anticancer and non-anticancer properties, guiding generation toward highly active sequences. A gradual fine-tuning strategy was employed, progressively unfreezing the last 6 layers of the ESM-2 encoder and applying discriminative learning rates with the AdamW optimizer. The model was trained with a combination of reconstruction loss, Kullback-Leibler (KL) divergence, and classification loss, optimized for balanced performance. For peptide generation, latent vectors were sampled using a variational Gaussian mixture model, decoded into sequences, and filtered based on length, amino-acid validity, and anticancer probability (> 0.5). Results: The model showed robust performance across domains. In classification, it achieved 0.89 accuracy, 0.88 precision, 0.87 recall, an F1 of 0.87, and an area under the receiver operating characteristic curve of 0.94 on the test set, indicating strong discrimination between ACPs and non-ACPs. In terms of generative capacity, the model produced 100 unique anticancer peptides from 121 generation attempts, highlighting its ability to create viable candidates. During training, the loss functions converged consistently: the fine-tune and reconstruction losses decreased steadily, while the KL loss remained stable, maintaining meaningful latent representations and confirming the model's ability to reconstruct peptide sequences accurately. Conclusions: This study advances ACP design with a CVAE model that integrates transfer learning and multi-task classification. The model's successful generation of new ACPs emphasizes its potential to expedite the clinical translation and development of effective therapies.
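
The three-part objective named in the abstract (reconstruction, KL divergence, classification) can be sketched as follows; the loss weights, tensor shapes, and the 21-token amino-acid vocabulary are assumptions for illustration, not the authors' implementation.

```python
# Sketch of the three-part objective: per-token sequence reconstruction loss,
# KL divergence on the latent code, and an auxiliary ACP/non-ACP classification
# loss. Loss weights and tensor shapes are assumed for illustration.
import torch
import torch.nn.functional as F

def cvae_peptide_loss(recon_logits, target_tokens, mu, logvar,
                      cls_logits, cls_labels, w_kl=0.1, w_cls=1.0):
    recon = F.cross_entropy(recon_logits.transpose(1, 2), target_tokens)   # per-token NLL
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())          # latent regularizer
    cls = F.binary_cross_entropy_with_logits(cls_logits, cls_labels)       # ACP vs non-ACP
    return recon + w_kl * kl + w_cls * cls

# Dummy shapes: batch of 4 peptides, length 30, vocabulary of 21 amino-acid tokens.
loss = cvae_peptide_loss(torch.randn(4, 30, 21), torch.randint(0, 21, (4, 30)),
                         torch.randn(4, 256), torch.randn(4, 256),
                         torch.randn(4), torch.rand(4))
```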

  • Book Chapter
  • Cited by 3
  • 10.1007/978-3-030-53980-1_113
Speech Emotion Recognition Model Based on CRNN-CTC
  • Aug 13, 2020
  • Zijiang Zhu + 4 more

The CRNN (Convolutional Recurrent Neural Network) deep learning model is currently a typical speech emotion recognition technique. When this model is applied, a speech sequence, however long, is mapped to a single emotion tag. However, the emotional information in speech samples is generally unevenly distributed across frames, which degrades the recognition performance of the model. To address this problem, a speech emotion recognition model based on CRNN-CTC (Convolutional Recurrent Neural Network with Connectionist Temporal Classification) is proposed in this paper. On top of the CRNN model, speech samples are first divided into emotional and non-emotional frames, and the CTC method is then used to make the network focus its learning on the emotional frames, avoiding the performance degradation caused by learning from non-emotional frames. Experimental results show that the model achieves a weighted average recall (WAR) of 70.11% and an unweighted average recall (UAR) of 69.53%; compared with the CRNN model, speech emotion recognition performance is significantly improved.
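
As a rough sketch of how a CTC objective is wired onto frame-level CRNN outputs, the snippet below uses PyTorch's built-in CTC loss with a blank class; the frame count, label scheme, and random tensors are illustrative, not the paper's setup.

```python
# Sketch of attaching a CTC objective to frame-level network outputs so that
# short emotion-label sequences supervise long utterances; class 0 is the CTC
# blank, and all shapes and labels here are placeholders.
import torch
import torch.nn as nn

T, N, C = 100, 4, 5            # frames, batch size, classes (blank + 4 emotions)
log_probs = torch.randn(T, N, C).log_softmax(dim=2)   # stand-in for CRNN frame outputs
targets = torch.randint(1, C, (N, 3))                 # one short emotion-label sequence per utterance
input_lengths = torch.full((N,), T, dtype=torch.long)
target_lengths = torch.full((N,), 3, dtype=torch.long)

ctc = nn.CTCLoss(blank=0)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
```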

  • Research Article
  • Cited by 71
  • 10.1049/cit2.12153
A semantic and emotion‐based dual latent variable generation model for a dialogue system
  • Jan 10, 2023
  • CAAI Transactions on Intelligence Technology
  • Ming Yan + 4 more

As intelligent agents become more human-like, artificial intelligence must take account of emotion, the most basic spiritual need in human interaction. Traditional emotional dialogue systems usually use an external emotion dictionary to select appropriate emotional words to add to the response, or concatenate emotion tags with semantic features in the decoding step to generate appropriate responses. However, selecting emotional words from a fixed dictionary may reduce the diversity and consistency of the response. We propose a semantic and emotion-based dual latent variable generation model (Dual-LVG) for dialogue systems, which generates appropriate emotional responses without an emotion dictionary. Unlike previous work, the conditional variational autoencoder (CVAE) adopts a standard transformer structure, and Dual-LVG regularises the CVAE latent space by introducing a dual latent space of semantics and emotion. The content diversity and emotional accuracy of the generated responses are improved by learning emotional and semantic features separately. Moreover, an average attention mechanism is adopted to better extract semantic features at the sequence level, and a semi-supervised attention mechanism is used in the decoding step to strengthen the fusion of the model's emotional features. Experimental results show that Dual-LVG can successfully generate different content by controlling emotional factors.

  • Research Article
  • Cited by 3
  • 10.53759/7669/jmc202404095
Deep Learning with Crested Porcupine Optimizer for Detection and Classification of Paddy Leaf Diseases for Sustainable Agriculture
  • Oct 5, 2024
  • Journal of Machine and Computing
  • Hussain A + 1 more

India has a very large population, and agriculture is its main source of food. Agricultural land is frequently damaged by plant and crop diseases, and detecting plant diseases with image processing models can help farmers protect their fields from damage. Paddy is a major crop worldwide, and early recognition of paddy diseases at different stages of growth is vital for paddy production; however, the current manual approach to identifying and classifying these diseases requires a highly trained farmer and is time-consuming. Deep learning (DL) is an effective research area for classifying agricultural patterns and can efficiently solve disease identification problems. This article therefore focuses on the design and development of a Deep Learning based Crested Porcupine Optimizer for the Detection and Classification of Paddy Leaf Diseases (DLCPO-DCPLD) method for sustainable agriculture. The main aim of the DLCPO-DCPLD method is to use DL for the recognition and identification of rice plant leaf diseases. To accomplish this, the technique performs image pre-processing using median filtering (MF) to improve the quality of the input images. Next, the ConvNeXt-L model is applied to extract feature vectors from the pre-processed images, and a conditional variational autoencoder (CVAE) model is used for the automated classification of paddy leaf diseases. Finally, the hyperparameters of the CVAE are tuned with the Crested Porcupine Optimizer (CPO). To validate the improved predictive results of the DLCPO-DCPLD method, a series of experiments is carried out on a benchmark dataset, showing a superior accuracy of 99.12% over existing approaches.
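
As a small illustration of the first step of this pipeline, the snippet below applies a median filter to a leaf image before feature extraction; the kernel size and the random stand-in image are placeholders.

```python
# Tiny sketch of the median-filter preprocessing step applied to a leaf image
# before feature extraction; the 3x3 kernel and random input are placeholders.
import numpy as np
from scipy.ndimage import median_filter

leaf_image = np.random.rand(224, 224, 3)               # stand-in RGB image
denoised = median_filter(leaf_image, size=(3, 3, 1))   # filter each channel spatially
features_ready = denoised.astype(np.float32)           # passed on to the feature extractor
```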
