Video captioning in Vietnamese using deep learning

Dang Thi Phuc, Nguyen Van Tinh, Tran Quang Trieu, Dau Sy Hieu

doi:10.11591/ijece.v12i3.pp3092-3103

Abstract

As society develops, demand for applications that use digital cameras grows year by year. However, analyzing large amounts of video data remains one of the most challenging issues. In addition to storing the data captured by the camera, intelligent systems are required to analyze the data quickly and respond to important situations. In this paper, we use deep learning techniques to build models that automatically describe movements in video. To solve the problem, we use three deep learning models: a sequence-to-sequence model based on a recurrent neural network, a sequence-to-sequence model with attention, and a transformer model. We evaluate the effectiveness of the approaches by comparing the results of the three models. To train these models, we use the Microsoft Research Video Description Corpus (MSVD) dataset, which includes 1,970 videos and 85,550 captions translated into Vietnamese. To ensure that the content is described in Vietnamese, we also combine these models with a natural language processing (NLP) model for Vietnamese.
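
The abstract names attention as the element that separates the second model from the first, but does not say which variant the authors use. As a rough illustration only, the sketch below assumes scaled dot-product attention: a decoder query is compared with each encoded video frame, the scores are normalized with a softmax, and the frames are averaged under those weights to form a context vector. The function name `attention_context` and the toy frame features are hypothetical and do not come from the paper.

```python
import math

def attention_context(query, frame_features):
    """Return (weights, context) for one decoder step.

    Assumption: scaled dot-product attention; the paper's abstract
    does not specify the attention variant.
    """
    dim = len(query)
    # Similarity score between the decoder query and each encoded frame.
    scores = [
        sum(q * f for q, f in zip(query, frame)) / math.sqrt(dim)
        for frame in frame_features
    ]
    # Numerically stable softmax: weights are positive and sum to 1.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    # Context vector: weighted average of the frame features.
    context = [
        sum(w * frame[i] for w, frame in zip(weights, frame_features))
        for i in range(dim)
    ]
    return weights, context

# Toy input: three hypothetical 2-dimensional frame features.
frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights, context = attention_context([1.0, 0.0], frames)
```

In this toy input the first and third frames align equally well with the query, so they receive equal weight and the second frame receives less. In a captioning model, a context vector of this kind would be recomputed at every output word.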

Journal: International Journal of Electrical and Computer Engineering (IJECE)
Publication Date: Jun 1, 2022
Citations: 1
License type: CC BY-SA 4.0
