To improve the visual recognition and localization accuracy of robotic arms for similar targets in complex scenes, a hybrid target recognition and localization method based on an industrial camera and a depth camera is proposed. First, YOLOv5s is adopted as the base algorithm model, given the speed and accuracy requirements of target recognition and localization. Then, to improve the accuracy of target recognition and coarse localization with the industrial camera (eye-to-hand), the AFPN feature-fusion module, the simple, parameter-free attention module (SimAM), and soft non-maximum suppression (Soft-NMS) are introduced; to improve the accuracy of target recognition and fine localization with the depth camera (eye-in-hand), the SENetV2 backbone network, a dynamic head module, a deformable attention mechanism, and a chain-of-thought prompted adaptive enhancer network are introduced. Next, a dual-camera platform for hybrid target recognition and localization is constructed, and hand–eye calibration is performed along with the collection and annotation of the image datasets required for model training. Finally, hybrid recognition and localization experiments are carried out for docking with an oil filling port. The results show that, in recognition and coarse localization with the industrial camera, the designed model achieves 99% recognition accuracy, with average localization errors of 2.22 mm and 3.66 mm in the horizontal and vertical directions, respectively; in recognition and fine localization with the depth camera, it achieves 98% recognition accuracy, with average errors of 0.12 mm, 0.28 mm, and 0.16 mm in the depth, horizontal, and vertical directions, respectively.
These results not only verify the effectiveness of the dual-camera hybrid recognition and localization approach but also demonstrate that it meets the high-precision recognition and localization requirements of complex scenes.
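Of the components named above, Soft-NMS is the most self-contained: instead of discarding every detection that overlaps a higher-scoring box (as hard NMS does), it decays the overlapping scores, which helps retain closely spaced similar targets. A minimal Gaussian Soft-NMS sketch is shown below; it is an illustration of the general technique, not the paper's implementation, and all function names and thresholds here are assumptions.

```python
import numpy as np

def iou(box, boxes):
    # Intersection-over-union of one box against many; boxes are [x1, y1, x2, y2].
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay overlapping scores instead of suppressing boxes.

    sigma and score_thresh are illustrative defaults, not values from the paper.
    Returns indices of kept detections, ordered by (decayed) score.
    """
    scores = scores.astype(float).copy()
    idxs = np.arange(len(scores))
    keep = []
    while len(idxs) > 0:
        m = np.argmax(scores[idxs])        # current highest-scoring detection
        best = idxs[m]
        keep.append(best)
        idxs = np.delete(idxs, m)
        if len(idxs) == 0:
            break
        ov = iou(boxes[best].astype(float), boxes[idxs].astype(float))
        scores[idxs] *= np.exp(-(ov ** 2) / sigma)   # Gaussian penalty on overlaps
        idxs = idxs[scores[idxs] > score_thresh]     # drop only near-zero scores
    return keep
```

With two heavily overlapping boxes of similar confidence, hard NMS would keep only one, whereas Soft-NMS keeps both with the second box's score reduced, which is the behavior that matters when similar targets sit close together.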