Massive Amounts Of Video Data Research Articles

This study focuses on the automatic detection of human actions in video streams. Recognizing which human activities occur in a video is difficult because of large variations in people's visual appearance, motion, and actions, as well as camera perspective shifts, moving backgrounds, occlusions, noise, and the massive amount of video data. The human activity recognition challenge involves identifying physical activities carried out by individuals or groups based on traces of movement, including gestures, actions, interactions, and group activities. Detecting concepts usually requires additional annotations for the training dataset. This paper discusses useful methods for categorizing human action recognition. The models presented are deep learning methods built on existing models that were modified to improve accuracy. Because of large disparities in background and object size, activity identification in videos has not yet been fully and effectively addressed. The main objective is to achieve better accuracy with the Long Short-Term Memory (LSTM) method, which is used to improve the Recurrent Neural Network (RNN) model. LSTM is used to build models for different action recognition tasks. The model was improved by giving the LSTM four layers with 128, 64, 32, and 16 units, respectively. The performance of the deep learning-based approaches is also compared with that of related works. An improved RNN approach is therefore proposed to recognize human actions. To classify the videos, a multilayer RNN with a specific type of LSTM is used to extract features from video sequences. The UCF-101 and UCF Sports human action recognition datasets are used for both training and evaluation. Test results show that the proposed approach achieves increased accuracy: the enhanced RNN model reaches an overall accuracy of 93.78% on the UCF-101 dataset and 95.70% on the UCF Sports dataset.
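To give a feel for the size of the four-layer LSTM stack described in this abstract, the sketch below counts the trainable weights per layer. Only the layer widths (128, 64, 32, 16) come from the abstract. The per-frame feature size `FEATURE_DIM` is a hypothetical value chosen for illustration, and the count assumes a standard LSTM cell with one bias vector per gate, which may differ from the authors' implementation.

```python
# Layer widths are taken from the abstract; everything else is an assumption.
LAYER_UNITS = (128, 64, 32, 16)
FEATURE_DIM = 2048  # hypothetical per-frame feature size, not stated in the abstract


def lstm_param_count(input_dim: int, units: int) -> int:
    """Trainable weights in one standard LSTM layer (single bias per gate).

    Each of the 4 gates (input, forget, cell candidate, output) has an
    input matrix (input_dim x units), a recurrent matrix (units x units),
    and a bias vector (units).
    """
    return 4 * (units * (input_dim + units) + units)


def stack_param_counts(feature_dim: int, layer_units) -> list:
    """Per-layer parameter counts for a stack where each layer feeds the next."""
    counts = []
    input_dim = feature_dim
    for units in layer_units:
        counts.append(lstm_param_count(input_dim, units))
        input_dim = units  # the next layer consumes this layer's hidden state
    return counts


counts = stack_param_counts(FEATURE_DIM, LAYER_UNITS)
for units, n in zip(LAYER_UNITS, counts):
    print(f"LSTM({units:>3}): {n:,} parameters")
print(f"total: {sum(counts):,}")
```

Under these assumptions the first layer dominates the total, because its size depends on the input feature dimension, while the three narrower layers stay small whatever features are fed in. Frameworks that keep two bias vectors per layer would report slightly higher counts.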

Video-text retrieval is a fundamental task in managing the emerging massive amounts of video data. The main challenge lies in learning a common representation space for videos and queries in which the similarity measurement reflects semantic closeness. However, existing video-text retrieval models may suffer from two kinds of noise in the common space learning procedure. First, the video-text correspondences in positive pairs may not be exact matches: existing datasets are annotated through crowdsourcing, and non-expert annotators inevitably introduce tagging noise. Second, video-text representations are learned from randomly sampled negative samples, so instances that are semantically similar to the query may be incorrectly categorized as negatives. To alleviate the adverse impact of these noisy pairs, we propose a novel robust video-text retrieval method that protects the model from noisy positive and negative pairs by identifying and calibrating noisy pairs with their uncertainty score. In particular, we propose a noisy pair identifier, which divides the training dataset into noisy and clean subsets based on the estimated uncertainty of each pair. Then, with the help of the uncertainties, we calibrate the two types of noisy pairs with an adaptive margin triplet loss and a weighted triplet loss function, respectively. To verify the effectiveness of our methods, we conduct extensive experiments on three widely used datasets. Experimental results show that the proposed robust video-text retrieval methods successfully identify and calibrate the noisy pairs and improve retrieval performance.
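The abstract names three components (a noisy pair identifier, an adaptive margin triplet loss, and a weighted triplet loss) without giving formulas. The sketch below shows one plausible shape for each, purely as an illustration. The threshold split, the linear margin shrinkage, the linear down-weighting, and all function and parameter names are assumptions, not the paper's definitions. Uncertainty is assumed to lie in [0, 1].

```python
def triplet_loss(sim_pos: float, sim_neg: float, margin: float) -> float:
    """Standard hinge triplet loss on similarities: the positive pair should
    score at least `margin` higher than the negative pair."""
    return max(0.0, margin - sim_pos + sim_neg)


def split_by_uncertainty(pairs, threshold=0.5):
    """Hypothetical noisy pair identifier: a plain threshold on uncertainty.
    Returns (clean, noisy)."""
    clean = [p for p in pairs if p["uncertainty"] < threshold]
    noisy = [p for p in pairs if p["uncertainty"] >= threshold]
    return clean, noisy


def adaptive_margin_triplet_loss(sim_pos, sim_neg, uncertainty, base_margin=0.2):
    """Assumed form for noisy positives: the less certain the positive match,
    the smaller the margin it is required to satisfy."""
    margin = base_margin * (1.0 - uncertainty)
    return triplet_loss(sim_pos, sim_neg, margin)


def weighted_triplet_loss(sim_pos, sim_neg, uncertainty, margin=0.2):
    """Assumed form for noisy negatives: a negative that is likely a false
    negative contributes less to the loss."""
    weight = 1.0 - uncertainty
    return weight * triplet_loss(sim_pos, sim_neg, margin)


# Toy data: similarities and uncertainties are made up for illustration.
pairs = [
    {"id": "a", "uncertainty": 0.1},
    {"id": "b", "uncertainty": 0.8},
]
clean, noisy = split_by_uncertainty(pairs)
print([p["id"] for p in clean], [p["id"] for p in noisy])

print(triplet_loss(0.5, 0.6, 0.2))                        # plain hinge loss
print(adaptive_margin_triplet_loss(0.5, 0.6, 1.0))        # margin shrunk to zero
print(weighted_triplet_loss(0.5, 0.6, 1.0))               # fully down-weighted
```

In this sketch both calibrated losses reduce to the ordinary triplet loss when the uncertainty is zero, so clean pairs are treated exactly as in a standard retrieval model.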

Articles published on Massive Amounts Of Video Data

Adaptive spatial down-sampling method based on object occupancy distribution for video coding for machines

Improved RNN Model for Real-Time Human Activity Recognition

Fast and accurate visual vibration measurement via derivative-enhanced phase-based optical flow

Edge Intelligence-Assisted Asymmetrical Network Control and Video Decoding in the Industrial IoT with Speculative Parallelization

DeepVQL: Deep Video Queries on PostgreSQL

Robust Video-Text Retrieval Via Noisy Pair Calibration

An NLP-guided ontology development and refinement approach to represent and query visual information

A Video Key Frame Extraction Method Based on Multiview Fusion

Automated Quantification of Brittle Stars in Seabed Imagery Using Computer Vision Techniques

When Crowd Meets Big Video Data: Cloud-Edge Collaborative Transcoding for Personal Livecast

Cloud-Assisted Multiview Video Summarization Using CNN and Bidirectional LSTM

A New Intelligent Video Surveillance Architecture

A Fast-Iterative Data Association Technique for Multiple Object Tracking

DPPDL: A Dynamic Partial-Parallel Data Layout for Green Video Surveillance Storage

A NEW COMBINATION METHOD FOR ENCRYPTION OF MOVING OBJECTS DETECTION IN VIDEO

Understanding the YouTube partners and their data: Measurement and analysis

Real-Time Fish Observation and Fish Category Database Construction
