Evaluating the Efficacy of Smart Saliency Detection System for Visual Prosthesis Users: An Experimental Comparison Across Various Visual Prosthesis Implants.
Individuals with visual prostheses often struggle to locate specific objects due to limited visual input. Current systems process visual information but fail to effectively highlight or prioritize objects based on user needs. This study investigates a user-centric object highlighting system designed to assist retinal prosthesis users in finding misplaced objects in a controlled experimental setting. The system uses pre-trained multimodal models, allowing users to specify objects through spoken instructions, which are then highlighted in the camera frame. We evaluated the system's performance in a simulated environment with 18 sighted participants acting as virtual patients. We examined how various retinal implant resolutions and stimulation points, from low to high (60-1600 electrodes), impact object recognition. Mixed-effects logistic regression, both frequentist and Bayesian, revealed significant variance in recognition outcomes based on fixed and random effects. High-resolution implants achieved the best and most consistent recognition rates, while lower-resolution implants showed suboptimal recognition. Notably, across all implants, larger objects were recognized more effectively compared to smaller items, indicating that even higher-resolution implants struggle with smaller objects. The characteristics of these objects, particularly size and distinct features, played a crucial role in their recognition performance. This underscores the necessity for effective detection systems tailored to the capabilities of implants, especially those with lower resolution, which lack sufficient detail for independent object recognition. These findings provide valuable insights for enhancing user-centric object highlighting systems and inform the development of real-world testing in complex environments.
- Research Article
494
- 10.1016/j.neuron.2012.04.036
- Jun 1, 2012
- Neuron
A Real-World Size Organization of Object Responses in Occipitotemporal Cortex
- Peer Review Report
- 10.7554/elife.69736.sa1
- Jun 3, 2021
Context-based object recognition causally relies on both scene- and object-selective cortex, with scene-selective cortex generating expectations (at 160-200 ms after onset) that disambiguate object representations in object-selective cortex (at 260-300 ms after onset).
- Research Article
33
- 10.3390/rs12091447
- May 3, 2020
- Remote Sensing
Grassland ecosystems can provide a variety of services for humans, such as carbon storage, food production, crop pollination and pest regulation. However, grasslands are today one of the most endangered ecosystems due to land use change, agricultural intensification, land abandonment and climate change. The present study explores the performance of a knowledge-driven GEOgraphic-Object-based Image Analysis (GEOBIA) learning scheme to classify Very High Resolution (VHR) images for natural grassland ecosystem mapping. The classification was applied to a Natura 2000 protected area in Southern Italy. The Food and Agricultural Organization Land Cover Classification System (FAO-LCCS) hierarchical scheme was instantiated in the learning phase of the algorithm. Four multi-temporal WorldView-2 (WV-2) images were classified by combining plant phenology and agricultural practices rules with prior-image spectral knowledge. Drawing on this knowledge, spectral bands and entropy features from a single date (Post Peak of Biomass) were first used for multiple-scale image segmentation into Small Objects (SO) and Large Objects (LO). Thereafter, SO were labelled by considering spectral and context-sensitive features from the whole multi-seasonal data set available together with ancillary data. Lastly, the labelled SO were overlaid on the LO segments and, in turn, the latter were labelled by adopting FAO-LCCS criteria about the dominance of SO presence in each LO. Ground reference samples were used only for validating the SO and LO output maps. The knowledge-driven GEOBIA classifier for SO classification obtained an overall accuracy (OA) of 97.35% with an error of 0.04. For LO classification the value was 75.09% with an error of 0.70. At SO scale, the grassland ecosystem was classified with a User’s Accuracy, Producer’s Accuracy and F1-score of 92.6%, 99.9% and 96.1%, respectively.
The findings reported indicate that the knowledge-driven approach can not only be applied for (semi)natural grassland ecosystem mapping in vast and inaccessible areas but can also reduce the costs of ground truth data acquisition. The approach may provide different levels of detail (small and large objects in the scene) and also indicates how to design and validate local conservation policies.
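The reported F1-score can be checked against the other two figures. The sketch below assumes the usual remote-sensing reading in which User's Accuracy corresponds to precision and Producer's Accuracy to recall, and that the F1-score is their harmonic mean; the function name is hypothetical and is not taken from the paper.

```python
def f1_score(users_accuracy: float, producers_accuracy: float) -> float:
    """Harmonic mean of user's accuracy (precision) and producer's accuracy (recall)."""
    total = users_accuracy + producers_accuracy
    if total == 0:
        return 0.0
    return 2 * users_accuracy * producers_accuracy / total


# Values reported for the grassland class at SO scale (92.6% and 99.9%).
grassland_f1 = f1_score(0.926, 0.999)
print(f"{grassland_f1:.3f}")  # prints 0.961
```

Under that assumption the result rounds to 96.1%, which agrees with the F1-score quoted in the abstract.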
- Conference Article
709
- 10.5121/csit.2019.91713
- Dec 21, 2019
In recent years, object detection has experienced impressive progress. Despite these improvements, there is still a significant gap in performance between the detection of small and large objects. We analyze the current state-of-the-art model, Mask-RCNN, on a challenging dataset, MS COCO. We show that the overlap between small ground-truth objects and the predicted anchors is much lower than the expected IoU threshold. We conjecture this is due to two factors: (1) only a few images contain small objects, and (2) small objects do not appear often enough even within the images that contain them. We thus propose to oversample those images with small objects and augment each of those images by copy-pasting small objects many times. This allows us to trade off the quality of the detector on large objects against that on small objects. We evaluate different pasting augmentation strategies, and ultimately, we achieve a 9.7% relative improvement on instance segmentation and 7.1% on object detection of small objects, compared to the current state-of-the-art method on MS COCO.
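The low overlap between small ground-truth objects and anchors can be illustrated with a plain intersection-over-union (IoU) computation. This is a minimal sketch, not the paper's code: the box coordinates are made-up examples, and the `Box` alias and function names are hypothetical.

```python
from typing import Tuple

# A box is (x_min, y_min, x_max, y_max) in pixels.
Box = Tuple[float, float, float, float]


def area(box: Box) -> float:
    """Area of a box; degenerate boxes count as zero."""
    return max(0.0, box[2] - box[0]) * max(0.0, box[3] - box[1])


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0


anchor = (0.0, 0.0, 64.0, 64.0)          # illustrative 64 x 64 anchor
small_object = (24.0, 24.0, 40.0, 40.0)  # 16 x 16 object inside the anchor
large_object = (0.0, 0.0, 64.0, 48.0)    # 64 x 48 object inside the anchor

print(iou(anchor, small_object))  # prints 0.0625
print(iou(anchor, large_object))  # prints 0.75
```

Even though the small object lies entirely inside the anchor, its IoU stays far below a typical matching threshold such as 0.5, while the large object clears it comfortably.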
- Research Article
16
- 10.1016/0167-5877(95)92833-j
- Sep 1, 1995
- Preventive Veterinary Medicine
Extending the interpretation and utility of mixed effects logistic regression models
- Conference Article
23
- 10.5121/csit.2019.91719
- Dec 21, 2019
Automation testing has become increasingly necessary because current software development projects involve complex applications and shorter development times. Most companies in the industry use Selenium extensively as a functional automation tool to verify that their web applications' functionalities work as expected. However, for any new project, manual testing is just as important as automation. Thus, this research project addresses the importance of manual and exploratory testing in industry while a project is still under development.
- Research Article
2
- 10.1177/0959651817721404
- Jul 28, 2017
- Proceedings of the Institution of Mechanical Engineers, Part I: Journal of Systems and Control Engineering
In order to improve the performances of object detection and recognition, a two-stage framework combined with variable resolution control strategy is proposed. The images in low resolution and high resolution are employed in object detection and object recognition, respectively. Meanwhile, a feedback mechanism used in two-stage framework is proved to effectively improve the performances of object detection. The results show that under low resolution, the accuracy of object detection based on fixed resolution 320 × 240 without feedback mechanism is 36.7%. However, the accuracy of object detection of the proposed method based on variable resolution increases to 95.3%. Under high resolution, compared with the method based on fixed resolution 1280 × 960 using the two-stage framework, the time consumption of the proposed method with variable resolution decreases by 51.4%, while keeping almost identical recognition accuracy.
- Research Article
30
- 10.1016/j.neuroimage.2021.118098
- Apr 30, 2021
- NeuroImage
The contribution of object size, manipulability, and stability on neural responses to inanimate objects
- Research Article
5
- 10.1167/tvst.7.5.29
- Oct 29, 2018
- Translational vision science & technology
Purpose. Efficacy of current visual prostheses in object recognition is limited. Among various limitations to be addressed, such as low resolution and low dynamic range, here we focus on reducing the impact of background clutter on object recognition. We have proposed the use of motion parallax via head-mounted camera lateral scanning and computationally stabilizing the object of interest (OI) to support neural background decluttering. Simulations in head-mounted displays (HMD), mimicking the proposed effect, were used to test object recognition in normally sighted subjects. Methods. Images (24° field of view) were captured from multiple viewpoints and presented at a low resolution (20 × 20). All viewpoints were centered on the OI. Experimental conditions (2 × 3) included clutter (with or without) × head scanning (single viewpoint, 9 coherent viewpoints corresponding to subjects' head positions, and 9 randomly associated viewpoints). Subjects used lateral head movements to view OIs in the HMD. Each object was displayed only once for each subject. Results. The median recognition rate without clutter was 40% for all head scanning conditions. Performance with synthetic background clutter dropped to 10% in the static condition, but it was improved to 20% with the coherent and random head scanning (corrected P = 0.005 and P = 0.049, respectively). Conclusions. Background decluttering using motion parallax cues but not the coherent multiple views of the OI improved object recognition in low-resolution images. The improvement did not fully eliminate the impact of background. Translational Relevance. Motion parallax is an effective but incomplete decluttering solution for object recognition with visual prostheses.
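To give a sense of what a 20 × 20 presentation means, the following sketch reduces a grayscale frame to 20 × 20 by averaging non-overlapping pixel blocks. The abstract does not say how the low-resolution images were produced, so block averaging is an assumption here, and the function name and test frame are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of pixel intensities


def block_average(image: Image, out_rows: int, out_cols: int) -> Image:
    """Downsample by averaging equal, non-overlapping blocks of pixels."""
    rows, cols = len(image), len(image[0])
    if rows % out_rows or cols % out_cols:
        raise ValueError("image size must be a multiple of the output size")
    block_h, block_w = rows // out_rows, cols // out_cols
    result: Image = []
    for r in range(out_rows):
        out_row = []
        for c in range(out_cols):
            block = [
                image[r * block_h + i][c * block_w + j]
                for i in range(block_h)
                for j in range(block_w)
            ]
            out_row.append(sum(block) / len(block))
        result.append(out_row)
    return result


# Hypothetical 40 x 40 test frame: bright left half, dark right half.
frame = [[255.0 if x < 20 else 0.0 for x in range(40)] for _ in range(40)]
low_res = block_average(frame, 20, 20)
print(len(low_res), len(low_res[0]))  # prints 20 20
```

At this resolution each output value summarizes a whole block of the original frame, which is one way to see why fine detail and background clutter become hard to tell apart.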
- Research Article
11
- 10.1088/1741-2552/aa966d
- Feb 16, 2018
- Journal of Neural Engineering
Objective. Retinal prosthesis devices have shown great value in restoring some sight for individuals with profoundly impaired vision, but the visual acuity and visual field provided by prostheses greatly limit recipients’ visual experience. In this paper, we employ computer vision approaches to seek to expand the perceptible visual field in patients implanted potentially with a high-density retinal prosthesis while maintaining visual acuity as much as possible. Approach. We propose an optimized content-aware image retargeting method, by introducing salient object detection based on color and intensity-difference contrast, aiming to remap important information of a scene into a small visual field and preserve their original scale as much as possible. It may improve prosthetic recipients’ perceived visual field and aid in performing some visual tasks (e.g. object detection and object recognition). To verify our method, psychophysical experiments, detecting object number and recognizing objects, are conducted under simulated prosthetic vision. As control, we use three other image retargeting techniques, including Cropping, Scaling, and seam-assisted shrinkability. Main results. Results show that our method outperforms in preserving more key features and has significantly higher recognition accuracy in comparison with other three image retargeting methods under the condition of small visual field and low-resolution. Significance. The proposed method is beneficial to expand the perceived visual field of prosthesis recipients and improve their object detection and recognition performance. It suggests that our method may provide an effective option for image processing module in future high-density retinal implants.
- Conference Article
916
- 10.1109/cvpr.2017.211
- Jul 1, 2017
Detecting small objects is notoriously challenging due to their low resolution and noisy representation. Existing object detection pipelines usually detect small objects through learning representations of all the objects at multiple scales. However, the performance gain of such ad hoc architectures is usually too limited to justify the computational cost. In this work, we address the small object detection problem by developing a single architecture that internally lifts representations of small objects to "super-resolved" ones, achieving similar characteristics as large objects and thus more discriminative for detection. For this purpose, we propose a new Perceptual Generative Adversarial Network (Perceptual GAN) model that improves small object detection through narrowing the representation difference between small objects and large ones. Specifically, its generator learns to transfer perceived poor representations of the small objects to super-resolved ones that are similar enough to real large objects to fool a competing discriminator. Meanwhile, its discriminator competes with the generator to identify the generated representation and imposes an additional perceptual requirement on the generator: generated representations of small objects must be beneficial for detection. Extensive evaluations on the challenging Tsinghua-Tencent 100K and Caltech benchmarks demonstrate the superiority of Perceptual GAN in detecting small objects, including traffic signs and pedestrians, over well-established state-of-the-art methods.
- Book Chapter
- 10.1007/978-3-030-84522-3_39
- Jan 1, 2021
In recent years, object recognition has experienced impressive progress. Despite these improvements, there is still a significant gap in performance between the detection of small and large objects. We analyze the limitations of existing algorithms for small target detection, such as (1) the high computational overhead of increasing image resolution and (2) the non-semantic data augmentation of small-object-copy-based strategies, which lead to worse mAP. We conclude that the limited number of semantic training samples is a key impediment for this task because of the high cost of collecting and labelling natural images. In this paper, we propose a simple but effective framework for small object recognition. With an improved generative model, we propose a multiple instance learning detector based on CNN, which jointly learns from the labeled natural datasets and unlabeled generated images. Our method shows state-of-the-art performance for small objects, obtained by Mask R-CNN, on MS COCO.
- Conference Article
6
- 10.1109/iccvw.2019.00143
- Oct 1, 2019
Objects are naturally captured over a continuous range of distances, causing dramatic changes in appearance, especially at low resolutions. Recognizing such small objects at range is an open challenge in object recognition. In this paper, we explore solutions to this problem by tackling the fine-grained task of face recognition. State-of-the-art embeddings aim to be scale-invariant by extracting representations in a canonical coordinate frame (by resizing a face window to a resolution of say, 224x224 pixels). However, it is well known in the psychophysics literature that human vision is decidedly scale variant: humans are much less accurate at lower resolutions. Motivated by this, we explore scale-variant multiresolution embeddings that explicitly disentangle factors of variation across resolution and scale. Importantly, multiresolution embeddings can adapt in size and complexity to the resolution of input image on-the-fly (e.g., high resolution input images produce more detailed representations that result in better recognition performance). Compared to state-of-the-art one-size-fits-all approaches, our embeddings dramatically reduce error for small faces by at least 70% on standard benchmarks (i.e. IJBC, LFW and MegaFace).
- Research Article
27
- 10.1016/j.fishres.2023.106710
- Apr 12, 2023
- Fisheries Research
SO-YOLOv5: Small object recognition algorithm for sea cucumber in complex seabed environment
- Conference Article
8
- 10.1109/icftic57696.2022.10075150
- Dec 2, 2022
Despite the great success of general-purpose object detectors in recent years, the detection performance for small objects is not yet satisfactory. Since small objects and large objects have different sensory field sensitivity and small objects can extract fewer features, there is still great potential for development in the field of small object detection. In this paper, we propose an improved algorithm based on Faster-RCNN for dense small object detection in complex environments. Slicing Aided Hyper Inference (SAHI) is used to improve the small object detection task. To optimize accuracy and computational effort, convolution is replaced by an involution operator and is integrated into the backbone of the network. Then, a novel label assignment strategy based on gaussian receptive field is introduced in the Faster-RCNN algorithm to better assign labels. Experiments were conducted on the VisDrone-DET2019 dataset, and the results show that the proposed algorithm is quite an improvement over the original one. The precision and recall of small objects are improved to 24.6% and 38.3%, respectively.
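The slicing idea behind Slicing Aided Hyper Inference can be sketched as computing overlapping windows over a large image, so that a detector sees small objects at a larger relative scale. This is an illustrative sketch only and is not the SAHI library's implementation; the function name, parameters, and the example image size are assumptions.

```python
from typing import List, Tuple

Window = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


def slice_windows(width: int, height: int, slice_size: int,
                  overlap_ratio: float) -> List[Window]:
    """Cover an image with square windows that overlap by overlap_ratio."""
    step = max(1, int(slice_size * (1 - overlap_ratio)))
    windows: List[Window] = []
    y = 0
    while True:
        # Clamp the last row of windows so it ends at the image border.
        y0 = min(y, max(0, height - slice_size))
        x = 0
        while True:
            # Clamp the last column of windows in the same way.
            x0 = min(x, max(0, width - slice_size))
            windows.append((x0, y0,
                            min(x0 + slice_size, width),
                            min(y0 + slice_size, height)))
            if x0 + slice_size >= width:
                break
            x += step
        if y0 + slice_size >= height:
            break
        y += step
    return windows


# Hypothetical 1024 x 512 image, 512-pixel slices, 50% overlap.
windows = slice_windows(1024, 512, 512, 0.5)
print(windows)
# prints [(0, 0, 512, 512), (256, 0, 768, 512), (512, 0, 1024, 512)]
```

In a full pipeline each window would be passed to the detector separately and the per-window boxes shifted back to image coordinates and merged, for example with non-maximum suppression; that step is omitted here.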