Simultaneous Object Detection Research Articles

Real-time tumor tracking is one motion management method to address motion-induced uncertainty. To date, fiducial markers are often required to reliably track lung tumors with X-ray imaging, which carries risks of complications and leads to prolonged treatment time. A markerless tracking approach is thus desirable. Deep learning-based approaches have shown promise for markerless tracking, but systematic evaluation and procedures to investigate applicability in individual cases are missing. Moreover, few efforts have been made to provide bounding box prediction and mask segmentation simultaneously, which could allow either rigid or deformable multi-leaf collimator tracking. The purpose of this study was to implement a deep learning-based markerless lung tumor tracking model exploiting patient-specific training which outputs both a bounding box and a mask segmentation simultaneously. We also aimed to compare the two kinds of predictions and to implement a specific procedure to understand the feasibility of markerless tracking on individual cases. We first trained a Retina U-Net baseline model on digitally reconstructed radiographs (DRRs) generated from a public dataset containing 875 CT scans and corresponding lung nodule annotations. Afterwards, we used an independent cohort of 97 lung patients to develop a patient-specific refinement procedure. In order to determine the optimal hyperparameters for automatic patient-specific training, we selected 13 patients for validation where the baseline model predicted a bounding box on planning CT (PCT)-DRR with intersection over union (IoU) with the ground-truth higher than 0.7. The final test set contained the remaining 84 patients with varying PCT-DRR IoU. For each testing patient, the baseline model was refined on the PCT-DRR to generate a patient-specific model, which was then tested on a separate 10-phase 4DCT-DRR to mimic the intrafraction motion during treatment. A template matching algorithm served as the benchmark model.
The testing results were evaluated by four metrics: the center of mass (COM) error and the Dice similarity coefficient (DSC) for segmentation masks, and the center of box (COB) error and the DSC for bounding box detections. Performance was compared to the benchmark model, including statistical testing for significance. A PCT-DRR IoU value of 0.2 was shown to be the threshold dividing inconsistent (68%) and consistent (100%) success (defined as mean bounding box DSC > 0.6) of PS models on 4DCT-DRRs. Thirty-seven out of the eighty-four testing cases had a PCT-DRR IoU above 0.2. For these 37 cases, the mean COM error was 2.6 mm, the mean segmentation DSC was 0.78, the mean COB error was 2.7 mm, and the mean box DSC was 0.83. Including the validation cases, the model was applicable to 50 out of 97 patients when using the PCT-DRR IoU threshold of 0.2. The inference time per frame was 170 ms. The model outperformed the benchmark model on all metrics, and the comparison was significant (p < 0.001) over the 37 PCT-DRR IoU > 0.2 cases, but not over the undifferentiated 84 testing cases. The implemented patient-specific refinement approach based on a pre-trained baseline model was shown to be applicable to markerless tumor tracking in simulated radiographs for lung cases.
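
The overlap and distance metrics named in this abstract (IoU, DSC, COM error) can be sketched as follows. This is a minimal illustration and not the study's evaluation code: the function names, the (x_min, y_min, x_max, y_max) box convention, and the pixel-spacing argument are assumptions made for the example.

```python
import numpy as np

def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max)."""
    # Overlap extent along each axis, clipped at zero when the boxes are disjoint.
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def dice_from_iou(iou):
    """DSC and IoU are monotonically related: DSC = 2 * IoU / (1 + IoU)."""
    return 2.0 * iou / (1.0 + iou)

def mask_dice(pred, truth):
    """Dice similarity coefficient of two binary masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    truth = np.asarray(truth, dtype=bool)
    total = pred.sum() + truth.sum()
    if total == 0:
        return 1.0  # assumption: two empty masks count as perfect agreement
    return float(2.0 * np.logical_and(pred, truth).sum() / total)

def com_error_mm(pred, truth, spacing_mm=(1.0, 1.0)):
    """Euclidean distance between mask centers of mass, in millimetres.

    spacing_mm is the (row, column) pixel size; both masks must be non-empty.
    """
    spacing = np.asarray(spacing_mm, dtype=float)
    com_pred = np.argwhere(np.asarray(pred, dtype=bool)).mean(axis=0) * spacing
    com_truth = np.argwhere(np.asarray(truth, dtype=bool)).mean(axis=0) * spacing
    return float(np.linalg.norm(com_pred - com_truth))
```

For two 2 x 2 boxes offset diagonally by one unit, `box_iou((0, 0, 2, 2), (1, 1, 3, 3))` gives 1/7 and `dice_from_iou` maps that to 0.25, which shows why a DSC threshold and an IoU threshold on the same boxes are not numerically interchangeable.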

While 3D object-centered shape-based models are appealing in comparison with 2D viewer-centered appearance-based models for their lower model complexity and potentially better view generalizability, the learning and inference of 3D models have been much less studied in the recent literature due to two factors: i) the enormous complexity of 3D shapes in geometric space; and ii) the gap between 3D shapes and their appearances in images. This paper aims at tackling the two problems by studying an And-Or Tree (AoT) representation that consists of two parts: i) a geometry-AoT quantizing the geometry space, i.e. the possible compositions of 3D volumetric parts and 2D surfaces within the volumes; and ii) an appearance-AoT quantizing the appearance space, i.e. the appearance variations of those shapes in different views. In this AoT, an And-node decomposes an entity into constituent parts, and an Or-node represents alternative ways of decomposition. Thus it can express a combinatorial number of geometry and appearance configurations through small dictionaries of 3D shape primitives and 2D image primitives. In the quantized space, the problem of learning a 3D object template is transformed into a structure search problem which can be efficiently solved by a dynamic programming algorithm that maximizes the information gain. We focus on learning 3D car templates from the AoT and collect a new car dataset featuring more diverse views. The learned car templates integrate both the shape-based model and the appearance-based model to combine the benefits of both.
In experiments, we show three aspects: 1) the AoT is more efficient than the frequently used octree method in space representation; 2) the learned 3D car template matches state-of-the-art performance on car detection and pose estimation in a public multi-view car dataset; and 3) in our new dataset, the learned 3D template solves the joint task of simultaneous object detection, pose/view estimation, and part localization. It can generalize over unseen views and performs better than version 5 of the DPM model in terms of object detection and semantic part localization.
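
The claim that an And-Or Tree expresses a combinatorial number of configurations from a small dictionary can be illustrated with a toy sketch. Everything here is hypothetical (the `Node` class, the `kind` strings, and the part names) and it is not the paper's geometry-AoT or appearance-AoT; it only shows that And-nodes multiply and Or-nodes add the configuration counts of their children.

```python
from dataclasses import dataclass, field
from itertools import product
from math import prod
from typing import List

@dataclass
class Node:
    kind: str                      # "and", "or", or "leaf" (hypothetical labels)
    name: str
    children: List["Node"] = field(default_factory=list)

def count_configurations(node: Node) -> int:
    """Number of distinct configurations the subtree can express."""
    if node.kind == "leaf":
        return 1
    counts = [count_configurations(c) for c in node.children]
    if node.kind == "and":         # every part is chosen independently
        return prod(counts)
    if node.kind == "or":          # exactly one alternative is chosen
        return sum(counts)
    raise ValueError(f"unknown node kind: {node.kind}")

def enumerate_configurations(node: Node) -> List[List[str]]:
    """List every configuration as the primitives (leaf names) it uses."""
    if node.kind == "leaf":
        return [[node.name]]
    child_sets = [enumerate_configurations(c) for c in node.children]
    if node.kind == "or":
        return [cfg for configs in child_sets for cfg in configs]
    if node.kind == "and":
        # Cartesian product over the parts, concatenating the chosen primitives.
        return [sum(combo, []) for combo in product(*child_sets)]
    raise ValueError(f"unknown node kind: {node.kind}")

# Toy object: one of three body shapes AND one of two roof shapes.
body = Node("or", "body", [Node("leaf", n) for n in ("sedan", "hatchback", "suv")])
roof = Node("or", "roof", [Node("leaf", n) for n in ("flat", "curved")])
car = Node("and", "car", [body, roof])
```

In this toy tree, five primitives yield `count_configurations(car) == 6` configurations; adding further And-ed parts multiplies the count while the dictionary grows only additively.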

Articles published on Simultaneous Object Detection

Simultaneous Object Detection and Distance Estimation for Indoor Autonomous Vehicles

Multi-object 3D segmentation of brain structures using a geometric deformable model with a priori knowledge

Simultaneous object detection and segmentation for patient-specific markerless lung tumor tracking in simulated radiographs with deep learning.

A neural learning approach for simultaneous object detection and grasp detection in cluttered scenes.

Simultaneous Segmentation and Classification of Pressure Injury Image Data Using Mask-R-CNN.

A comparison of deep learning algorithms on image data for detecting floodwater on roadways

A loss-balanced multi-task model for simultaneous detection and segmentation

Performance Evaluation of Edge Orientation Histograms Based System for Real-time Object Detection in Two Separate Platforms

Boosting Multi-Vehicle Tracking with a Joint Object Detection and Viewpoint Estimation Sensor.

The challenge of simultaneous object detection and pose estimation: A comparative study

Learning 3D Object Templates by Quantizing Geometry and Appearance Spaces.

Multi-robot system using low-cost infrared sensors

GEOMETRIC INVARIANTS CONSTRUCTION FOR SEMANTIC SCENE UNDERSTANDING FROM MULTIPLE VIEWS INSPIRED BY THE HUMAN VISUAL SYSTEM

Adaptive background generation for automatic detection of initial object region in multiple color-filter aperture camera-based surveillance system

Adaptive object detection and recognition based on a feedback strategy
