Articles published on Crowd counting
647 Search results
- Research Article
- 10.3390/s26010333
- Jan 4, 2026
- Sensors (Basel, Switzerland)
- Kai Zhao + 3 more
Crowd counting is a critical computer vision task with significant applications in public security and smart city systems. While deep learning has markedly improved accuracy, persistent challenges include extreme scale variations, severe occlusion, and complex background clutter. To address these issues, we propose a novel Hybrid Multi-Scale Transformer-CNN U-shaped Network (HMSTUNet). Our key contributions are: a hybrid architecture integrating a Multi-Scale Vision Transformer (MSViT) for capturing long-range dependencies and a Dynamic Convolutional Attention Block (DCAB) for modeling local density patterns; and a U-shaped encoder–decoder with skip connections for effective multi-level feature fusion. Extensive evaluations on five public benchmarks show that HMSTUNet achieves the best Mean Absolute Error (MAE) on all five datasets and the best Mean Squared Error (MSE) on three. It sets new state-of-the-art records, attaining MAE/MSE of 49.1/77.8 on SHA, 6.2/10.3 on SHB, 142.1/192.7 on UCF_CC_50, 77.9/132.5 on UCF-QNRF, and 43.2/119.6 on NWPU-Crowd. These results demonstrate the model’s strong robustness and generalization capability.
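The MAE/MSE pairs quoted above compare per-image predicted counts against ground-truth counts. A minimal sketch of both metrics follows. It assumes, as is common in crowd counting papers, that the figure reported as "MSE" is the square root of the mean squared count error; the abstract does not state this. The function name and inputs are hypothetical.

```python
import math


def mae_and_rmse(predicted_counts, true_counts):
    """Return (MAE, RMSE) over per-image crowd counts.

    Assumption: the "MSE" column in crowd counting tables is usually
    this root-mean-squared error; the abstract does not confirm it.
    """
    if len(predicted_counts) != len(true_counts) or not true_counts:
        raise ValueError("need two non-empty lists of equal length")
    errors = [p - t for p, t in zip(predicted_counts, true_counts)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return mae, rmse


if __name__ == "__main__":
    # Two test images: predicted 10 and 20 people, actual 12 and 17.
    print(mae_and_rmse([10, 20], [12, 17]))
```

Because the errors are squared before averaging, RMSE penalizes a few large miscounts more heavily than MAE does.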
- Research Article
- 10.1016/j.eswa.2025.129041
- Jan 1, 2026
- Expert Systems with Applications
- Yiru Du + 3 more
BoostCount: Diffusion-based position-sensitive adversarial purification for crowd counting
- Research Article
- 10.1007/s00371-025-04238-4
- Jan 1, 2026
- The Visual Computer
- Yubo Yang + 4 more
DFMNet: deep fusion mamba network for multimodal crowd counting
- Research Article
- 10.1016/j.image.2025.117423
- Jan 1, 2026
- Signal Processing: Image Communication
- Chongle Peng + 4 more
MTDNet: A crowd counting network based on a multiscale transformer and dilated convolution
- Research Article
- 10.3390/app16010161
- Dec 23, 2025
- Applied Sciences
- Jian Liu + 3 more
Crowd counting is a significant task in computer vision. By combining the rich texture information from RGB images with the insensitivity to illumination changes offered by thermal imaging, the applicability of models in real-world complex scenarios can be enhanced. Current research on RGB-T crowd counting primarily focuses on feature fusion strategies, multi-scale structures, and the exploration of novel network architectures such as Vision Transformer and Mamba. However, existing approaches face two key challenges: limited robustness to illumination shifts and insufficient handling of scale discrepancies. To address these challenges, this study develops a robust RGB-T crowd counting framework that remains stable under illumination shifts by introducing two key innovations beyond existing fusion and multi-scale approaches: (1) a cross-modal adaptive fusion module (CMAFM) that actively evaluates and fuses reliable cross-modal features under varying scenarios by simulating a dynamic feature selection and trust allocation mechanism; and (2) a multi-scale aggregation module (MSAM) that unifies features with different receptive fields to an intermediate scale and performs weighted fusion to enhance modeling capability for cross-modal scale variations. The proposed method achieves relative improvements of 1.57% in GAME(0) and 0.78% in RMSE on the DroneRGBT dataset compared to existing methods, and improvements of 2.48% and 1.59% on the RGBT-CC dataset, respectively. It also demonstrates higher stability and robustness under varying lighting conditions. This research provides an effective solution for building stable and reliable all-weather crowd counting systems, with significant application prospects in smart city security and management.
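GAME(L), the Grid Average Mean absolute Error, is usually computed by splitting each image into a 2^L by 2^L grid and summing the absolute count error over the cells, so GAME(0) reduces to the plain per-image count error. The sketch below follows that common definition for a single image; the abstract does not spell out the definition it uses, and the function name and the list-of-lists density map format are assumptions.

```python
def game(pred_density, true_density, level):
    """Per-image GAME(level) for two density maps given as 2D lists.

    The image is cut into 2**level x 2**level non-overlapping cells and
    the absolute difference of the cell counts is summed. Averaging this
    value over a test set gives the reported metric.
    """
    rows, cols = len(true_density), len(true_density[0])
    cells = 2 ** level
    total = 0.0
    for i in range(cells):
        r0, r1 = rows * i // cells, rows * (i + 1) // cells
        for j in range(cells):
            c0, c1 = cols * j // cells, cols * (j + 1) // cells
            pred = sum(sum(row[c0:c1]) for row in pred_density[r0:r1])
            true = sum(sum(row[c0:c1]) for row in true_density[r0:r1])
            total += abs(pred - true)
    return total


if __name__ == "__main__":
    pred = [[1.0, 0.0], [0.0, 1.0]]
    true = [[0.0, 1.0], [1.0, 0.0]]
    # Same total count, but every person is placed in the wrong cell.
    print(game(pred, true, 0), game(pred, true, 1))
```

Higher levels penalize a model that gets the total right while putting people in the wrong part of the image, which a global count error cannot detect.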
- Research Article
- 10.3390/app152413211
- Dec 17, 2025
- Applied Sciences
- Junzhe Mao + 4 more
Counting small, densely clustered objects from low-altitude aerial views is challenging due to large scale variations, complex backgrounds, and severe occlusion, which often degrade the performance of fully supervised or density-regression methods. To address these issues, we propose a weakly supervised crowd counting framework that leverages point-level supervision and a feature-adaptive fusion strategy to enhance perception under low-altitude aerial views. The network comprises a front-end feature extractor and a back-end fusion module. The front-end adopts the first 13 convolutional layers of VGG16-BN to capture multi-scale semantic features while preserving crucial spatial details. The back-end integrates a Feature-Adaptive Fusion module and a Multi-Scale Feature Aggregation module: the former dynamically adjusts fusion weights across scales to improve robustness to scale variation, and the latter aggregates multi-scale representations to better capture targets in dense, complex scenes. Point-level annotations serve as weak supervision to substantially reduce labeling cost while enabling accurate localization of small individual instances. Experiments on several public datasets, including ShanghaiTech Part A, ShanghaiTech Part B, and UCF_CC_50, demonstrate that our method surpasses existing mainstream approaches, effectively mitigating scale variation, background clutter, and occlusion, and providing an efficient and scalable weakly supervised solution for small-object counting.
- Research Article
- 10.1038/s41598-025-30387-6
- Dec 1, 2025
- Scientific Reports
- Yuri Shendryk + 1 more
Mangrove ecosystems are the focus of extensive conservation and restoration efforts due to their critical roles in coastal protection, carbon sequestration, and climate change mitigation. Detecting mangrove seedlings in remotely sensed imagery is essential for evaluating restoration success and prioritizing conservation areas. While seedling detection research has predominantly been conducted in agricultural settings, studies in natural environments remain limited, and no dedicated methods have been proposed for mangrove ecosystems. To address this gap, this study develops a model for detecting mangrove seedlings in ultra-high-resolution UAV imagery (0.85 cm) across 22 seeding sites in the Emirate of Abu Dhabi, United Arab Emirates. Building on the success of deep learning in crowd counting tasks, the seedling detection model was developed through a two-stage process. In the first stage, Gaussian blurring was applied to seedling locations to generate a density map, which was then predicted from UAV images using an encoder-decoder MaxViT-UNet architecture. In the second stage, the predicted density map was further thresholded using a Difference of Gaussians method to accurately localize individual seedlings. The model achieved variable performance, with a peak F1-score of 0.70 on the validation dataset (precision: 0.65, recall: 0.76). The developed model was also benchmarked against ResNet-DETR, a state-of-the-art object detection framework, and achieved a 9% improvement in F1-score. While the developed model shows promising performance in identifying mangrove seedlings, challenges remain, such as labelling inaccuracies, potential inconsistencies in UAV imagery over time, and inherent limitations associated with deep learning methods. Nevertheless, this study highlights the potential of UAV-based deep learning models for accurately detecting mangrove seedlings at scale, providing a powerful tool to support restoration monitoring and inform more effective conservation strategies in mangrove ecosystems.
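The first stage described above, turning point annotations into a density map by Gaussian blurring, can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: the kernel width `sigma`, the grid size, and the choice to normalize each kernel over the image so that the map sums to the number of annotated points are all hypothetical.

```python
import math


def gaussian_density_map(points, height, width, sigma=1.5):
    """Build a density map from (row, col) point annotations.

    Each point contributes one Gaussian bump that is normalized over
    the image, so the whole map sums to len(points). The sigma value
    is an illustrative assumption.
    """
    density = [[0.0] * width for _ in range(height)]
    for py, px in points:
        weights = [
            [math.exp(-((y - py) ** 2 + (x - px) ** 2) / (2 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ]
        norm = sum(map(sum, weights))
        for y in range(height):
            for x in range(width):
                density[y][x] += weights[y][x] / norm
    return density


if __name__ == "__main__":
    dmap = gaussian_density_map([(2, 3), (6, 6)], height=8, width=8)
    # The integral of the map recovers the object count.
    print(round(sum(map(sum, dmap)), 6))
```

A network trained to regress such a map gives a count by summing its output, and the local peaks are what a later localization step, such as the Difference of Gaussians thresholding mentioned in the abstract, would search for.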
- Research Article
- 10.1016/j.patcog.2025.111832
- Dec 1, 2025
- Pattern Recognition
- Shenjian Gong + 5 more
Spatially adaptive pyramid feature fusion for scale-aware crowd counting
- Research Article
- 10.1007/s11760-025-04919-6
- Nov 10, 2025
- Signal, Image and Video Processing
- Shihui Zhang + 4 more
CMFNet: Cross-attention multi-scale fusion network for cross-modal crowd counting
- Research Article
- 10.1016/j.patrec.2025.08.005
- Nov 1, 2025
- Pattern Recognition Letters
- Zhanqiang Huo + 4 more
VMamba-Crowd: Bridging multi-scale features from Visual Mamba for weakly-supervised crowd counting
- Research Article
- 10.1016/j.patcog.2025.111709
- Nov 1, 2025
- Pattern Recognition
- Miaogen Ling + 4 more
Dual-branch adjacent connection and channel mixing network for video crowd counting
- Research Article
- 10.1016/j.imavis.2025.105750
- Nov 1, 2025
- Image and Vision Computing
- Lifang Zhou + 1 more
Enhanced crowd counting with weighted attention network and multi-scale feature integration
- Research Article
- 10.1049/icp.2025.2895
- Oct 1, 2025
- IET Conference Proceedings
- Jialin Xie + 1 more
Enhancing PET for crowd counting with adaptive thresholding and polar kernel convolution
- Research Article
- 10.1016/j.comcom.2025.108245
- Sep 1, 2025
- Computer Communications
- Beiming Yan + 6 more
Crowd counting with WiFi sensing based on iterative attentional feature fusion
- Research Article
- 10.1038/s41598-025-14056-2
- Aug 19, 2025
- Scientific Reports
- Sabri Boughorbel + 6 more
Object counting can be formulated as a density estimation task using point-annotated images. Although such labeling is cost-effective, trained models can be sensitive to annotation noise. In this paper, we propose a method called DUMLO (Distribution Uncertainty Matching for Loss Optimization) that defines a loss function between a ground-truth density map and a target density map by modeling uncertainty over an augmented set of points. DUMLO formulates the loss function as a coupling between two optimal transport problems, which involves an unknown density map defined over the augmented points. To solve the problem, we propose a new algorithm, called Trihorn, which jointly estimates the loss function and the density map of the augmentation set. The latter can be interpreted as a measure of the uncertainty associated with the annotations. We provide a theoretical analysis and show that the generalization error bound of the proposed loss is tight. We extensively evaluate our model on benchmark datasets from three real-world applications: pathology cell counting, crowd counting and Vehicle Images Datasets. Our results demonstrate that the proposed model achieves good performance in terms of Mean Absolute Error and is robust to annotation noise while exhibiting a fast convergence property.
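For background on the optimal transport framing, the sketch below solves a single entropy-regularized transport problem with standard Sinkhorn iterations. It is not the Trihorn algorithm or the DUMLO loss, whose details the abstract does not give; the function name, the regularization strength `reg`, and the iteration count are illustrative assumptions.

```python
import math


def sinkhorn_plan(a, b, cost, reg=0.1, iters=200):
    """Entropy-regularized optimal transport plan between histograms.

    a, b: source and target masses (each summing to the same total).
    cost: cost[i][j] of moving mass from source bin i to target bin j.
    Standard Sinkhorn scaling, shown as background only; very large
    cost / reg ratios would underflow in this naive form.
    """
    n, m = len(a), len(b)
    kernel = [[math.exp(-cost[i][j] / reg) for j in range(m)] for i in range(n)]
    u, v = [1.0] * n, [1.0] * m
    for _ in range(iters):
        # Alternately rescale rows and columns to match the marginals.
        u = [a[i] / sum(kernel[i][j] * v[j] for j in range(m)) for i in range(n)]
        v = [b[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(m)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(m)] for i in range(n)]


if __name__ == "__main__":
    plan = sinkhorn_plan([0.5, 0.5], [0.5, 0.5], [[0.0, 1.0], [1.0, 0.0]])
    # Nearly all mass stays on the zero-cost diagonal.
    print([[round(p, 4) for p in row] for row in plan])
```

In a counting loss built on this idea, the two histograms would be a predicted density map and the annotated points, and the transport cost measures how far predicted mass sits from the annotations.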
- Research Article
- 10.1007/s10586-025-05226-y
- Aug 14, 2025
- Cluster Computing
- Jing-An Cheng + 5 more
Towards trustworthy crowd counting by distillation hierarchical mixture of experts for edge-based cluster computing
- Research Article
- 10.1016/j.eswa.2025.128023
- Aug 1, 2025
- Expert Systems with Applications
- Jienan Shen + 3 more
PromptHC: Multi-attention prompt guided haze-weather crowd counting
- Research Article
- 10.1016/j.atech.2025.100963
- Aug 1, 2025
- Smart Agricultural Technology
- Dianzhuo Zhou + 4 more
Line-labelling enhanced CNNs for transparent juvenile fish crowd counting
- Research Article
- 10.1016/j.neucom.2025.130304
- Aug 1, 2025
- Neurocomputing
- Ankit Tomar + 2 more
EDCCN: A benchmark encoder-decoder framework for accurate crowd counting
- Research Article
- 10.1007/s00521-025-11426-9
- Jul 30, 2025
- Neural Computing and Applications
- Heba F Elsepae + 3 more
Novel approach for crowd counting combining VGG16 and efficientnetb7 for optimal performance in harsh weather