Task Of Pose Estimation Research Articles

Local feature matching involves the task of establishing pixel-wise correspondences between a pair of images. As an integral component of many computer vision applications (e.g., visual localization), this task has been successfully performed using Transformer-based methods. However, these methods typically extract numerous keypoints from sparse texture regions to construct a densely-connected graph neural network (GNN) for long-range feature aggregation, which inevitably triggers redundant message exchange and hampers the learning process. Furthermore, they employ transformer encoder layers that treat images as 1D sequences, leaving them incapable of extracting multiscale local structural information from the images, which is critical for establishing correspondences in image pairs with significant scale shifts. In this study, we develop FMAP, an innovative detector-free approach that enables accurate local feature matching. For the first issue, FMAP employs an anchor points feature aggregation module (APAM) that captures representative keypoints and discards extraneous keypoints to build a sparsified GNN for compact yet clean message exchange, with the key insight that keypoints containing abundant visual information are distinguishable from their neighbors. For the second issue, FMAP proposes a global–local multiscale perception module (GMPM), which incorporates abundant multiscale local context information into the global feature representation by employing multiple depth-wise convolutions with varying kernel sizes, thereby generating discriminative features that are robust to scale shifts. In addition, depth-wise convolutions are used in the feed-forward network of the GMPM to further fuse the global context information and the local feature representation. Extensive experiments on several standard benchmarks demonstrate that the proposed FMAP method significantly outperforms state-of-the-art methods. Compared to the cutting-edge methods MatchFormer, QuadTree, and TopicFM on the relative pose estimation task, FMAP surpasses them by 2.27%, 0.58%, and 1.08%, respectively, in terms of AUC@5°. In addition, FMAP noticeably outperforms the baseline LoFTR by (2.38%, 1.89%, 1.45%) in terms of AUC@(5°, 10°, 20°). Moreover, we integrate FMAP into an official visual localization framework and conduct a visual localization experiment, with the results showing that FMAP exceeds LoFTR by 2.3% in terms of AP.
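
The abstract above reports results as AUC@(5°, 10°, 20°) but does not say how the metric is computed. As a minimal sketch, assuming the convention commonly used in the feature-matching literature (one angular pose error in degrees per image pair, and the area under the cumulative recall curve up to each threshold, normalized by that threshold), the computation could look like the following. The function name pose_auc and the sample error values are hypothetical and are not taken from the paper.

```python
from bisect import bisect_left


def pose_auc(errors, thresholds):
    """Area under the recall-vs-error curve, normalized per threshold.

    errors: one angular pose error (degrees) per image pair. How that
        error is obtained is an assumption here; a common choice is the
        larger of the rotation and translation angular errors.
    thresholds: positive angles in degrees, e.g. (5, 10, 20).
    """
    if not errors:
        raise ValueError("at least one pose error is required")
    # Sorted errors with a leading 0 so the curve starts at (0, 0).
    errs = [0.0] + sorted(float(e) for e in errors)
    n = len(errs) - 1
    # recall[i] = fraction of pairs whose error is <= errs[i].
    recall = [i / n for i in range(n + 1)]

    aucs = []
    for t in thresholds:
        # Keep the points strictly below t, then close the curve at t
        # by holding the last recall value flat.
        last = bisect_left(errs, t)
        xs = errs[:last] + [float(t)]
        ys = recall[:last] + [recall[last - 1]]
        # Trapezoidal integration over the piecewise-linear curve.
        area = sum(
            (xs[i + 1] - xs[i]) * (ys[i + 1] + ys[i]) / 2.0
            for i in range(len(xs) - 1)
        )
        aucs.append(area / t)
    return aucs


if __name__ == "__main__":
    # Illustrative errors only; these are not results from the paper.
    sample_errors = [0.8, 2.5, 4.1, 7.9, 15.0, 42.0]
    for t, auc in zip((5, 10, 20), pose_auc(sample_errors, (5, 10, 20))):
        print(f"AUC@{t}deg = {100 * auc:.2f}%")
```

Under this convention a method is rewarded both for bringing more pairs under the threshold and for how far below it their errors fall, which is why differences of one or two percentage points at AUC@5° are usually treated as meaningful.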

General movement assessment (GMA) of infant movement videos (IMVs) is an effective method for early detection of cerebral palsy (CP) in infants. We demonstrate in this paper that end-to-end trainable neural networks for image sequence recognition can be applied to achieve good results in GMA, and more importantly, that augmenting raw video with infant body parsing and pose estimation information can significantly improve performance. To solve the problem of efficiently utilizing partially labeled IMVs for body parsing, we propose a semi-supervised model, termed SiamParseNet (SPN), which consists of two branches, one for intra-frame body part segmentation and another for inter-frame label propagation. During training, the two branches are jointly trained by alternating between input pairs of only labeled frames and input pairs of both labeled and unlabeled frames. We also investigate training data augmentation by proposing a factorized video generative adversarial network (FVGAN) to synthesize novel labeled frames for training. FVGAN decouples foreground and background generation, which allows multiple labeled frames to be generated from one real labeled frame. At test time, we employ a multi-source inference mechanism, where the final result for a test frame is obtained either via the segmentation branch or via propagation from a nearby key frame. We conduct extensive experiments for body parsing using SPN on two infant movement video datasets; on these partially labeled IMVs, we show that SPN coupled with FVGAN achieves state-of-the-art performance. We further demonstrate that our proposed SPN can be easily adapted to the infant pose estimation task with superior performance. Finally, we explore the clinical application of our method for GMA. We collected a new clinical IMV dataset with GMA annotations, and our experiments show that our SPN models for body parsing and pose estimation trained on the first two datasets generalize well to the new clinical dataset, and that their results can significantly boost the performance of convolutional recurrent neural network (CRNN) based GMA prediction when combined with raw video inputs.

Articles published on Task Of Pose Estimation

Enhancing 6-DoF Object Pose Estimation through Multiple Modality Fusion: A Hybrid CNN Architecture with Cross-Layer and Cross-Modal Integration

DSPose: Dual-Space-Driven Keypoint Topology Modeling for Human Pose Estimation

FMAP: Learning robust and accurate local feature matching with anchor points

DUA: A Domain-Unified Approach for Cross-Dataset 3D Human Pose Estimation

A 3D pose estimation framework for preterm infants hospitalized in the Neonatal Unit

Motion-aware and data-independent model based multi-view 3D pose refinement for volleyball spike analysis

A Monocular SLAM System Based on ResNet Depth Estimation

Multi-Angle Models and Lightweight Unbiased Decoding-Based Algorithm for Human Pose Estimation

PyCaNet: Pose Estimation of Primary and Middle School Students Using Pyramid Convolutions and Coordinate Attention

An improved lightweight high-resolution network based on multi-dimensional weighting for human pose estimation

Ensemble of 6 DoF Pose estimation from state-of-the-art deep methods

Dynamic Vehicle Pose Estimation with Heuristic L-Shape Fitting and Grid-Based Particle Filter

Pose estimation and motion analysis of ski jumpers based on ECA-HRNet

Full-BAPose: Bottom Up Framework for Full Body Pose Estimation

A Pose-Aware Global Representation Network for Human Parsing

MMDA: Multi‐person marginal distribution awareness for monocular 3D pose estimation

Learning to Reduce Scale Differences for Large-Scale Invariant Image Matching

Animal Pose Estimation Algorithm Based on the Lightweight Stacked Hourglass Network

Single Person Dense Pose Estimation via Geometric Equivariance Consistency

Semi-supervised body parsing and pose estimation for enhancing infant general movement assessment
