Articles published on Field of view
6886 Search results
- New
- Research Article
- 10.1016/j.mri.2026.110617
- Jun 1, 2026
- Magnetic resonance imaging
- Zhiyuan Li + 16 more
An incomplete field-of-view (FOV) in diffusion magnetic resonance imaging (dMRI) can severely hinder the volumetric and bundle analyses of whole-brain white matter connectivity. Although existing works have investigated imputing the missing regions using deep generative models, it remains unclear how to specifically utilize additional information from paired multi-modality data and whether this can enhance the imputation quality and be useful for downstream tractography. To fill this gap, we propose a novel framework for imputing dMRI scans in the incomplete part of the FOV by integrating the learned diffusion features in the acquired part of the FOV to the complete brain anatomical structure. We hypothesize that by this design the proposed framework can enhance the imputation performance of the dMRI scans and therefore be useful for repairing whole-brain tractography in corrupted dMRI scans with incomplete FOV. We tested our framework on two cohorts from different sites with a total of 96 subjects and compared it with a baseline imputation method that treats the information from T1w and dMRI scans equally. The proposed framework achieved significant improvements in imputation performance, as demonstrated by angular correlation coefficient (p < 1E-5), and in downstream tractography accuracy, as demonstrated by Dice score (p < 0.01). Results suggest that the proposed framework improved imputation performance in dMRI scans by specifically utilizing additional information from paired multi-modality data, compared with the baseline method. The imputation achieved by the proposed framework enhances whole-brain tractography, and therefore reduces the uncertainty when analyzing bundles associated with neurodegenerative disease.
- New
- Research Article
- 10.1016/j.identj.2026.109438
- Jun 1, 2026
- International dental journal
- David Macdonald + 4 more
Nasopharynx on Cone-beam Computed Tomography.
- New
- Research Article
- 10.1002/mrm.70302
- Jun 1, 2026
- Magnetic resonance in medicine
- Joshua Marchant + 3 more
To evaluate the use of the Model Predictive Filtering (MPF) method to improve temporal resolution of magnetic resonance temperature imaging (MRTI) for monitoring laser interstitial thermal therapy (LITT) ablations. Using a Green's function method for solving differential equations, a treatment-specific power matrix Q was derived from a LITT heating and used in the Pennes bioheat equation (PBHE) to model a subsequent higher power heating and supplement subsampled k-space data. This MPF method was evaluated using both 3D segmented EPI data and a tissue mimicking phantom and clinical LITT treatment data after retrospective subsampling. Reconstruction accuracy was assessed via thermal dose and analysis of the hottest voxel and region-of-voxels over time. In the phantom data, temporal resolution equivalent to a 12-slice acquisition was produced with larger fields-of-view (24 and 36 slices, R = 2 and 3) with good hottest voxel-over-time accuracy and 240 CEM43 volume agreement (Dice similarity coefficient, DSC 0.7). In the in vivo data, MPF reconstruction showed excellent 240 CEM43 volume agreement for both orthogonal slices (DSC 0.9 for R = 2 and 3). The sagittal and coronal slices showed excellent hottest voxel accuracy for subsampling of R 3, with an RMSE ≤ 1°C. Hottest voxel RMSE remained within 1°C-3°C up to a subsampling factor of 5. The MPF algorithm allowed for large field-of-view (FOV) volumetric temperature imaging without decreasing temporal resolution in phantom heatings. Bi-planar clinical treatment data reconstruction showed good accuracy for the application of MPF to in vivo data.
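The 240 CEM43 volumes reported above refer to thermal dose expressed as cumulative equivalent minutes at 43 °C. The following is a minimal sketch of that dose calculation, assuming the commonly used Sapareto-Dewey formulation (R = 0.5 at or above 43 °C, R = 0.25 below). The function name `cem43` and the sample temperatures are illustrative assumptions and are not taken from the paper.

```python
def cem43(temps_c, dt_s):
    """Cumulative equivalent minutes at 43 °C for one voxel.

    temps_c: temperatures in °C, one per time step (hypothetical samples)
    dt_s:    duration of each time step in seconds
    Assumes the Sapareto-Dewey constants: R = 0.5 for T >= 43 °C, else 0.25.
    """
    total_min = 0.0
    for t in temps_c:
        r = 0.5 if t >= 43.0 else 0.25
        # Each step contributes its duration (in minutes) scaled by R^(43 - T).
        total_min += (dt_s / 60.0) * r ** (43.0 - t)
    return total_min


if __name__ == "__main__":
    # One second at 57 °C already exceeds the 240 CEM43 threshold.
    print(cem43([57.0], 1.0) >= 240.0)
```

Because the dose grows exponentially with temperature, errors in the hottest voxels dominate the dose volume, which is one reason hottest-voxel accuracy is reported alongside the Dice agreement.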
- New
- Research Article
- 10.1016/j.ejrad.2026.112762
- Jun 1, 2026
- European journal of radiology
- Ulf Bach + 8 more
While cone beam CT (CBCT) is commonly used in musculoskeletal imaging of the extremities, its application in spinal imaging has been restricted by small field-of-view (FOV) coverage. Recent advancements in gantry-based CBCT systems promise to enable comprehensive imaging of the spinal column. This study aimed to evaluate the performance of a novel gantry-based, multi-scan CBCT system for spinal imaging with complete anatomic coverage and compare it to energy integrating (EI)CT and photon counting (PC)CT using dose-matched protocols. An anthropomorphic torso phantom was used to simulate human anatomy. Gantry-based CBCT scans of the thoracolumbar spine were performed using different presets (low-dose, enhanced, best quality), while EICT and PCCT scans followed dose-matched clinical protocols. Qualitative image quality was assessed by three blinded readers using a 4-point Likert scale, and quantitative analysis was conducted using global noise level (GNL) measurements. CBCT achieved diagnostic-quality imaging for the thoracic and lumbar spine, particularly with "best" and "enhanced" presets. Subjective image quality was highest for PCCT, followed by EICT and CBCT. CBCT demonstrated lower GNL than EICT, nearing PCCT levels. However, high radiation doses (5 mGy) were required for CBCT imaging of the upper thoracic spine (Th1-Th6) due to anatomical complexity, while low doses (0.5 mGy) sufficed for the lower thoracolumbar spine (Th7-S1). Gantry-based CBCT was able to generate diagnostic-quality images of large spinal regions at relatively low radiation doses in a phantom setting, although the upper thoracic spine (above Th6) required higher doses. The overall subjective image quality remained below EICT and PCCT.
- New
- Research Article
- 10.1038/s41377-026-02291-9
- May 18, 2026
- Light, Science & Applications
- Pengyinjie Lyu + 1 more
Head-mounted displays (HMDs) based on the well-established rectilinear sampling method are subject to the inherent trade-off between wide field of view (FOV) and high spatial resolution. This challenge limits their broader application due to constraints in manufacturing high-resolution displays and the substantial data bandwidth required for rendering, storage, and transmission. Foveated display technology alleviates this issue by allocating resources differently between the region of interest and the peripheral region. However, most existing solutions rely on dynamic dual-resolution schemes that are costly and complex, requiring multiple displays or optical paths, two-dimensional steering mechanisms, and eye-tracking systems. We propose and demonstrate a perception-driven approach to the design of a three-element freeform eyepiece featuring spatially varying optical power. The novel eyepiece enables the creation of a statically foveated optical see-through HMD, yielding a display of an 80° diagonal FOV and a peak resolution density of 60 pixels per degree with a 4 K display panel. The system offers high perceived resolution across the FOV with imperceptible or minimal degradation and resolution discontinuity with eye movements. Our approach eliminates the need for eye tracking, scanning mechanisms, or multiple displays, significantly reducing hardware complexity. Compared to the rectilinear sampling scheme offering the same peak resolution density and FOV, our system reduces pixel usage by more than 35% or equivalently 4.4 million fewer pixels.
- New
- Research Article
- 10.35848/1347-4065/ae62c4
- May 13, 2026
- Japanese Journal of Applied Physics
- Keisuke Hirotani + 3 more
To improve the performance of silicon-photonics-based light detection and ranging (LiDAR), we investigated the optimal design and experimental performance of the slow-light grating (SLG), a nonmechanical beam scanner integrated into a LiDAR system. The SLG structure was topologically optimized via covariance matrix adaptation evolution strategy to suppress beam divergence, sidelobes, and their wavelength dependence, while enhancing the field of view (FOV) and output efficiency. We identified SLG designs that improve specific performance while maintaining the others, which was also confirmed experimentally. A symmetric SLG, particularly suitable for small footprint and simple control, achieves the widest FOV of 56°, representing a 1.4-fold increase over conventional SLGs, while suppressing beam divergence. An asymmetric SLG, suitable for high efficiency, enhances the radiation efficiency by 4.5 dB and LiDAR signal intensity by >10 dB, while maintaining a wide FOV of 50°. These improvements will relax the requirements for the beam-collimating lens and enhance the LiDAR sensitivity.
- New
- Research Article
- 10.1088/1674-4527/ae5f70
- May 12, 2026
- Research in Astronomy and Astrophysics
- Xupiao Yang + 8 more
The rapid expansion of low-Earth-orbit (LEO) megaconstellations introduces new risks to radio astronomy from unintended electromagnetic radiation (UEMR). In this work, we present an attempt to search for UEMR from Starlink satellites using the 21 Centimeter Array (21CMA). Because the sensitivity of a single pod observation is limited, we focus on developing a robust observing and detection pipeline. Using Two-Line Element (TLE) data, we predict satellite transit times to guide the observations, and we define entry into the field of view (FoV) as an apparent declination greater than 85° with respect to the 21CMA. We analyze the system equivalent flux density (SEFD) and the resulting single-pod sensitivity limits, which explain the detection of emission originating from the ORBCOMM satellites, rather than any detectable broadband UEMR in our dynamic spectra. To validate the methodology, we developed a Python package, orbdemod, to demodulate ORBCOMM downlink signals in our data. The recovered satellite ID agrees with the satellite predicted by our maximum-declination analysis, thereby validating the accuracy of our transit prediction and identification framework. Furthermore, via modulation power spectrum analysis, we show that the impulsive broadband bursts are produced by power line arcing near the array rather than by satellite UEMR.
- New
- Research Article
- 10.1088/1361-6560/ae639d
- May 12, 2026
- Physics in Medicine & Biology
- Allison Sydney Lowe + 7 more
Objective. MRI of the hand and wrist region, containing small structures such as ligaments, peripheral nerves, and thin cartilage, is challenging, requiring both high spatial resolution and coverage, along with adequate signal-to-noise ratio (SNR) for image quality. This work aimed to evaluate whether a high-density, prototype flexible coil array could better visualize small anatomic structures without compromising SNR or field-of-view (FOV), compared to conventional hand and wrist coils. Approach. A flexible, 32-channel phased-array prototype coil was developed for 3-Tesla hand and wrist MRI using novel, small dual-loop elements. A silicon oil phantom was used to analyze SNR and noise amplification arising from acceleration factors (R) of 2-5. Confirmation of the prototype's phantom performance was done in wrist MRI comparisons in two healthy volunteers and against two 16-channel commercially available coils for superior-inferior (SI) coverage and at different acceleration factors (R = 1.5-4). Additional imaging in 6 patients was performed in the hand, wrist, elbow, brachial plexus, and foot for anecdotal, qualitative assessment of the coil's performance. Main Results. Phantom testing demonstrated ∼26% higher SNR and 19.2%-192.4% lower g-factors in the prototype compared to a 16-channel conventional coil. In comparisons against other coils, the prototype coil demonstrated larger SI coverage of 23.9 cm compared to 20.5 cm and 20.6 cm in conventional coils. Image quality was maintained at higher acceleration factors. Patient imaging successfully demonstrated visualization of small structures. Significance. The 32-channel prototype coil's extended FOV, improved SNR, and flexible design enabled accelerated scans with high spatial resolution and broader anatomical coverage, offering technical improvements over conventional coil designs.
- New
- Research Article
- 10.1088/1361-6560/ae64a4
- May 11, 2026
- Physics in Medicine & Biology
- Razieh Azizi + 3 more
Objective. Cone beam computed tomography (CBCT) often has a truncated acquired field of view (FOV) due to the limited detector size, leading to image reconstruction from truncated projection data. CBCT reconstructions using an image volume that just encloses the acquired FOV exhibit reconstruction artifacts due to attenuation in the tissues outside the image volume. On the other hand, extending the high-resolution voxel volume far enough beyond the FOV to fully enclose the imaged body often leads to a significant increase in the computational complexity of model-based iterative reconstruction techniques. We propose a multi-resolution reconstruction model that eliminates the out-of-FOV reconstruction artifacts and enables accurate recovery of Hounsfield unit (HU) values within the FOV. Approach. We propose a multi-resolution extended reconstruction volume (MR-ERV) approach that extends the image volume beyond the FOV using separate extension volumes with coarser voxel representation, leading to appropriate modeling of the observed rays outside the FOV without a significant increase in computational complexity. Furthermore, we demonstrate that augmenting the model with a simple projection extrapolation yields a further reduction of the out-of-FOV artifacts. In this study, the model is evaluated with model-based iterative reconstruction using high-resolution 3D CBCT data. The optimization problems considered are non-negativity constrained least-squares estimation, with and without regularization. The optimization is performed using a primal-dual hybrid gradient algorithm. Results. The proposed MR-ERV model effectively removes out-of-FOV reconstruction artifacts and also achieves accurate HU values within the FOV when the volume extension fully encloses the imaged body in the transaxial direction. Significance. The MR-ERV model provides a platform for computationally efficient and accurate model-based iterative reconstruction of CBCT data.
- New
- Research Article
- 10.1038/s41598-026-52248-6
- May 11, 2026
- Scientific reports
- Anton Sheahan Quinsten + 10 more
Accurate field-of-view (FoV) prescription in oblique coronal and axial planes is essential for high-quality prostate MRI but remains operator-dependent and variable. We developed and evaluated a ResNet-based deep learning framework for automated FoV planning. In this retrospective multicenter study, FoV prescriptions were annotated on the PI-CAI dataset. Three readers assessed intra- and inter-rater variability to establish reference consistency. Three neural network variants were trained on 1,474 examinations from the PI-CAI dataset (2012-2021), and the optimal model was selected by internal validation. Generalizability and clinical utility were tested on three external cohorts totaling 530 examinations (2021-2024) using a non-inferiority design. The selected model achieved non-inferior performance for slice positioning, with differences ranging from 0.16 ± 0.99 to 0.37 ± 0.48. Across sites, FoV overlaps ranged from 82.4 ± 4.1% to 88.7 ± 6.0%, and the angle differences between predicted and reference planes were 4.66 ± 4.89° (Site I), 3.46 ± 2.80° (Site II), and 2.99 ± 2.90° (Site III). Clinical utility was high at all sites, with acceptability rates of 97.9%, 97.7%, 98.8%, 98.1% and 98.1% for Site I (Raters 1-5), 95.7%, 97.8%, 100%, 95.7% and 97.8% for Site II (Raters 1-5), and 100% for all raters at Site III. These findings demonstrate the feasibility of automated FoV positioning for prostate MRI and indicate excellent clinical utility.
- New
- Research Article
- 10.1073/pnas.2602705123
- May 11, 2026
- Proceedings of the National Academy of Sciences
- Huayu Hou + 15 more
In vivo microscopy (IVM) has shown great promise to improve early detection of epithelial precancer, but it suffers from fundamental trade-offs that limit the resolution, field-of-view (FOV) and depth-of-field (DOF). Here, we present PrecisionView, a compact, deep learning-enabled endomicroscope that breaks these constraints and achieves 20 mm² FOV and 500 µm DOF with 4 µm resolution, representing an approximately 5× increase in FOV and 8× larger DOF compared to conventional IVM with similar resolution. PrecisionView integrates a deep learning-optimized phase mask and real-time reconstruction, enabling rapid in vivo assessment of two key hallmarks of cancer: epithelial cell nuclear morphology and subsurface microvasculature through fluorescence and reflectance imaging. By imaging the oral cavity of healthy volunteers and cervical specimens with precancerous lesions, PrecisionView generates large-scale (1 to 3 cm²) coregistered maps of cellular and vascular structures, revealing distinct microscopic patterns associated with anatomic structures and precancerous lesions. Our results suggest the potential of this computational endomicroscope to address the unmet need for early cancer detection at the point of care.
- Research Article
- 10.1002/advs.75390
- May 7, 2026
- Advanced science (Weinheim, Baden-Wurttemberg, Germany)
- Lintao Peng + 5 more
Noninvasive imaging through scattering media is crucial for diverse applications but remains constrained by a narrow field of view (FOV). Although recent learning-based methods have a larger FOV, they often require large-scale real experimental datasets and struggle when the FOV is far beyond the optical memory effect (OME). Here, we propose a physics-guided adaptive dual-domain diffusion model for ultra-wide-field noninvasive imaging through scattering media, namely UNI-Net. Specifically, we first develop a physical scattering imaging model to synthesize large-scale pre-training data, thereby reducing dependence on real experimental datasets. Second, to maximize the utilization of speckle information, we partition each speckle pattern into multi-channel patches to guide the diffusion process. Third, we propose a spatial-channel parallel attention block to model the spatial sparsity and inter-channel similarity of speckle patches with linear complexity. Extensive experiments show that our method cuts reliance on real experimental data by an order of magnitude and achieves a PSNR of 31.23 dB at a 41 OME range in complex scenes, which is 49.5% higher than existing approaches while requiring significantly lower computational and memory costs. Even at an extreme 164 OME range where other methods fail, it still reliably reconstructs complex scenes with a PSNR of 27.21 dB.
- Research Article
- 10.1038/s41467-026-71832-y
- May 7, 2026
- Nature communications
- Henry Crawford-Eng + 5 more
Integrated optical phased arrays (OPAs) have emerged as a promising technology for many applications due to their ability to dynamically control free-space optical beams in a compact and non-mechanical manner. However, these integrated OPAs typically have a restricted field of view (FOV), limited by grating lobes caused by large antenna pitches that are typically necessary to reduce crosstalk between the antennas in the integrated OPA. In this work, we develop and experimentally demonstrate for the first time, to the best of our knowledge, a set of integrated grating-based antennas with significantly-reduced inter-antenna crosstalk that enable half-wavelength-pitch integrated OPAs with grating-lobe-free and wide-FOV functionality. First, we derive a generalized theoretical model to describe the coupling dynamics between lossy modes in a system and use this model to analyze the coupling between antennas. Next, we design and demonstrate a set of three integrated grating-based antennas with different propagation coefficients to enable reduced inter-antenna crosstalk, successfully measuring a significant reduction from 100% to 1% coupling. Finally, using these reduced-crosstalk antennas, we develop and demonstrate a half-wavelength-pitch integrated OPA, successfully demonstrating grating-lobe-free and wide-FOV functionality. This work facilitates new functionality for high-performance integrated OPAs.
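The link between antenna pitch and grating lobes described above follows from the standard uniform-array grating equation, sin θ_m = sin θ_0 + mλ/d. The sketch below, under that textbook assumption, lists which diffraction orders radiate for a given pitch and steering angle. The function name `visible_lobes` and the 1.55 µm wavelength are illustrative choices, not values from the paper.

```python
import math


def visible_lobes(pitch, wavelength, steer_deg, max_order=10):
    """List (order, angle in degrees) for every radiating lobe of a uniform array.

    Uses the grating equation sin(theta_m) = sin(theta_0) + m * wavelength / pitch.
    Order m = 0 is the main lobe; any other radiating order is a grating lobe.
    """
    s0 = math.sin(math.radians(steer_deg))
    lobes = []
    for m in range(-max_order, max_order + 1):
        s = s0 + m * wavelength / pitch
        if abs(s) <= 1.0:  # only |sin| <= 1 corresponds to a real emission angle
            lobes.append((m, math.degrees(math.asin(s))))
    return lobes


if __name__ == "__main__":
    wavelength = 1.55  # micrometres, an assumed operating wavelength
    # Half-wavelength pitch: only the main lobe radiates when steered to 60 degrees.
    print(visible_lobes(wavelength / 2, wavelength, 60.0))
    # A 1.5-wavelength pitch: grating lobes appear even at broadside.
    print(visible_lobes(1.5 * wavelength, wavelength, 0.0))
```

With a pitch of λ/2 the term mλ/d equals 2m, so no order other than m = 0 can satisfy |sin θ| ≤ 1 for steering angles inside ±90°, which is why half-wavelength pitch gives grating-lobe-free operation.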
- Research Article
- 10.1016/j.jmr.2026.108080
- May 6, 2026
- Journal of magnetic resonance (San Diego, Calif. : 1997)
- Jun-Qi Yang + 5 more
A scalable design framework for single-sided permanent magnet arrays (PMAs).
- Research Article
- 10.1093/bjrai/ubag010
- May 4, 2026
- BJR|Artificial Intelligence
- Hideaki Hirashima + 5 more
Objectives: This study aimed to restore missing regions from the limited field of view (FOV) using image- and sinogram-based conditional GAN (cGAN) models. Methods: cGANs are deep learning frameworks that generate realistic data via a competitive neural network process. We used planning CT (pCT) datasets from 96 patients: 64 for training, 16 for validation, and 16 for internal testing. Two cGAN models (image-based and sinogram-based) were developed to generate body contour outside the FOV. Next, 23 cone-beam CT (CBCT) datasets were evaluated as an external test group. Results: In pCT internal test datasets, the median values for mean absolute error (MAE), root mean square error (RMSE), and structural similarity index measure (SSIM) for each model were as follows: image-based model: 101.73 HU for MAE, 39.26 HU for RMSE, and 0.83 for SSIM; sinogram-based model: 16.91 HU for MAE, 23.19 HU for RMSE, and 0.91 for SSIM. In CBCT external test datasets, the sinogram-based model outperformed the image-based model with a median MAE of 73.32 HU versus 180.72 HU, a median RMSE of 37.02 HU versus 43.42 HU, and a median SSIM of 0.75 versus 0.63. The sinogram-based model demonstrated significant improvements in MAE, RMSE, and SSIM (p < 0.05). Conclusions: The sinogram-based cGAN model exhibits considerable potential for restoring missing regions outside the FOV, outperforming the image-based model in accuracy metrics. Advances in knowledge: This model offers a novel approach to accurately predict missing regions from a limited FOV, enhancing continuity of the body contour while accommodating patient-specific variations.
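For readers unfamiliar with the error metrics above, here is a minimal sketch of how MAE and RMSE are conventionally computed between a restored image and its reference, with voxel values in HU. The function names and the three sample values are illustrative assumptions; the study's own evaluation code and any masking it applied are not described in the abstract. SSIM is omitted because it needs windowed local statistics.

```python
import math


def mae(pred, ref):
    """Mean absolute error between two equal-length sequences of HU values."""
    return sum(abs(p - r) for p, r in zip(pred, ref)) / len(ref)


def rmse(pred, ref):
    """Root mean square error between two equal-length sequences of HU values."""
    return math.sqrt(sum((p - r) ** 2 for p, r in zip(pred, ref)) / len(ref))


if __name__ == "__main__":
    predicted = [10.0, 20.0, 30.0]   # hypothetical restored voxel values (HU)
    reference = [12.0, 20.0, 26.0]   # hypothetical ground-truth values (HU)
    print(mae(predicted, reference), rmse(predicted, reference))
```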
- Research Article
- 10.1002/mp.70472
- May 1, 2026
- Medical physics
- Liqiang Ren + 9 more
Contrast enhancement is the most sensitive indicator for detecting breast malignancies. Computed tomography (CT) has had a limited role for the locoregional staging of breast tumors due to low soft tissue contrast. To evaluate and optimize the performance of clinical photon-counting computed tomography (PCCT) for breast cancer imaging using a contrast-enhanced mammography (CEM) phantom and to compare its imaging performance with dual-source dual-energy CT (DS-DECT). A CEM phantom containing simulated breast lesions was positioned on an anthropomorphic thoracic phantom and scanned using a clinical PCCT system at 120 kV in multi-energy mode and a DS-DECT system with two kV pairs of 70/Sn150 kV and 90/Sn150 kV. PCCT scanner variables included scan mode [standard resolution (SR) and ultra-high resolution (UHR)], field of view (FOV) size (large and small), matrix size (512 and 1024), and type of image used for analysis [low-energy threshold images, virtual monoenergetic images (VMIs) at 50, 60, and 70 keV, and iodine maps]. Quantitative analysis was performed using circular regions of interest (ROIs) placed on iodine-containing lesions and background within the phantom. For each ROI, mean CT numbers or iodine concentrations and standard deviations were measured across the central five slices and three independent scans. Contrast-to-noise ratio (CNR) and circularity were evaluated across all PCCT configurations and image types and compared with those obtained from DS-DECT. Among all PCCT configurations, the UHR mode with a small FOV and either a 512 or 1024 matrix at 50 keV VMI achieved the highest combined CNR across all iodine concentrations. Additionally, the UHR mode with a 512 matrix and either small or large FOV yielded the highest combined circularity values. The optimal PCCT configuration achieved higher CNR, and higher or comparable circularity compared with 50 keV VMIs derived from DECT scans.
This phantom study demonstrated that optimal spectral performance for potential breast cancer imaging with PCCT is achieved using UHR mode, low-keV VMIs, a regular matrix size, and dedicated reconstruction FOVs, outperforming DECT.
- Research Article
- 10.1002/mp.70488
- May 1, 2026
- Medical physics
- Yiqun Han + 6 more
Cone beam computed tomography (CBCT) is widely used in clinical practice and small animal research for image guidance. The reconstruction quality will be compromised by severe truncation-related artifacts when the scanned object is not fully covered by the field of view (FOV). This work aims to develop a Dual-Domain Deep learning-based method for CBCT Reconstruction from Truncated projections (D3CRT) through the guidance of non-truncated prior information. The D3CRT comprised sequential procedures in both projection and image domains. First, in projection domain, a Sinogram Generation Network (SG-Net) based on the denoising diffusion probabilistic model (DDPM) was employed to predict the missing projection data outside the FOV. The SG-Net was fine-tuned via transfer learning using non-truncated prior data to achieve object-specific adaptation. FDK reconstruction was subsequently performed using the predicted projections. Second, in image domain, an Image Enhancement Network (IE-Net) was applied to refine the FDK reconstructed images. Compressed sensing (CS) reconstruction was then carried out to enforce data fidelity by incorporating the original projections, followed by a secondary IE-Net for final image quality enhancement. In-vivo small animal experiments were conducted on a micro-CBCT system to validate the D3CRT method, with non-truncated prior data obtained from large-FOV low-resolution scans. Dice similarity coefficient (DSC), structural similarity index measure (SSIM), root mean square error (RMSE) were used for quantitative evaluation. The proposed D3CRT effectively improves the image reconstruction quality under truncated projection conditions. For whole-body and lung regions, D3CRT achieved DSCs of 97.1% and 96.0%, outperforming the low-resolution prior images (DSCs of 96.8% and 89.9%) when compared with the reference region segmentations. 
Quantitative evaluations within the FOV yielded an average RMSE of 2.95 and an SSIM of 98.1% for D3CRT, demonstrating better performance than the Low Resolution Image Constrained Reconstruction (LRICR) method, which directly takes low-resolution prior images as the initial inputs for CS reconstruction (RMSE 3.83, SSIM 97.4%). By leveraging non-truncated prior information and projection-domain transfer learning, the proposed D3CRT effectively improved the overall quality of CBCT reconstruction from truncated projections.
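The Dice similarity coefficient used above (and in several other entries in this list) measures the overlap between two segmentations. A minimal sketch follows, assuming each segmentation is given as a collection of voxel indices. The function name `dice` and the toy index sets are hypothetical and only illustrate the standard definition 2|A ∩ B| / (|A| + |B|).

```python
def dice(seg_a, seg_b):
    """Dice similarity coefficient between two collections of voxel indices."""
    a, b = set(seg_a), set(seg_b)
    if not a and not b:
        return 1.0  # convention chosen here: two empty segmentations agree fully
    return 2.0 * len(a & b) / (len(a) + len(b))


if __name__ == "__main__":
    reconstructed = {1, 2, 3}   # hypothetical voxel indices labelled as lung
    reference = {2, 3, 4}       # hypothetical reference segmentation
    print(dice(reconstructed, reference))
```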
- Research Article
- 10.1121/10.0043843
- May 1, 2026
- The Journal of the Acoustical Society of America
- Mick Gardner + 2 more
Several applications of medical ultrasound can benefit from a larger field of view (FOV). This study is aimed at increasing the FOV of linear array probes by increasing the element width. Coupled elements were used to imitate a larger element width. Through Fourier analysis, theoretical pressure amplitudes, and bandwidth estimates, coupled elements are shown to be close approximations of large elements. The effects of coupling on resolution, contrast, and speckle signal-to-noise ratio are investigated through phantom images and in vivo images of a rabbit tumor reconstructed with plane wave compounding. Furthermore, a positioning system was used to acquire data from a virtual large aperture with 120 mm FOV and 128 elements, collected in sections with a single probe. The null subtraction imaging (NSI), sign coherence factor, and minimum variance (MV) beamformers are compared for regaining resolution lost by an increased F-number. The NSI beamformer decreased full-width at half-max estimates of wire targets by 79% with coupling by 2 compared to uncoupled DAS. The MV beamformer was best for maintaining speckle statistics while improving resolution. Our results demonstrate how increased element width can increase FOV with no increase to element count.
- Research Article
- 10.1109/tvcg.2026.3679095
- May 1, 2026
- IEEE transactions on visualization and computer graphics
- Shiyu Li + 5 more
Multi-camera dynamic Augmented Reality (AR) applications require camera pose estimation to leverage individual information from each camera in one common system. This can be achieved by combining contextual information, such as markers or objects, across multiple views. While cameras are commonly calibrated in an initial step or updated through the constant use of markers, another option is to leverage information already present in the scene, like known objects. Another downside of marker-based tracking is that markers have to be tracked inside the field-of-view (FoV) of the cameras. To overcome these limitations, we propose a constant dynamic camera pose estimation leveraging spatiotemporal FoV overlaps of known objects on the fly. To achieve that, we enhance the state-of-the-art object pose estimator to update our spatiotemporal scene graph, enabling a relation even among non-overlapping FoV cameras. To evaluate our approach, we introduce a multi-camera, multi-object pose estimation dataset with temporal FoV overlap, including static and dynamic cameras. Furthermore, in FoV overlapping scenarios, we outperform the state-of-the-art on the widely used YCB-V and T-LESS datasets in camera pose accuracy. Our performance on both previous and our proposed datasets validates the effectiveness of our marker-less approach for AR applications. The code and dataset are available at https://github.com/roth-hex-lab/IEEE-VR-2026-MultiCam.
- Research Article
- 10.1016/j.jmir.2026.102197
- May 1, 2026
- Journal of medical imaging and radiation sciences
- Anton Sheahan Quinsten + 10 more
Accurate prescription of oblique coronal and oblique sagittal fields of view (FOVs) is essential for diagnostic shoulder MRI. Manual planning is radiographer-dependent, time-consuming, and subject to inter- and intra-operator variability, leading to inconsistent image quality and incomplete coverage. Although deep learning (DL) has advanced automated scan planning in non-oblique planes, oblique shoulder prescriptions remain underexplored; an automated DL approach could standardize FOV prescription, reduce operator dependence, and improve reproducibility and workflow without compromising diagnostic quality. In this retrospective multicenter study, 575 shoulder MRI examinations (2019-2025) from four sites were included. Sites A (n=151) and B (n=220) were used for training; testing was performed on sites C (n=61) and D (n=143). A two-stage pipeline was implemented using five oriented bounding box (OBB) variants of YOLOv11 (n, s, m, l, x): Stage 1 performed slice selection; Stage 2 performed FOV prescription. Performance was evaluated against radiographers' prescriptions using mean absolute slice difference (MASD, slices), intersection over union (IoU), and mean absolute angle difference (MAAD, degrees). Clinical utility was assessed by three raters. The YOLOv11-OBB-l model achieved the lowest MASD for Stage 1 (1.016±0.153 slices). For Stage 2, YOLOv11-OBB-x performed best (coronal IoU, 0.847±0.003; sagittal IoU, 0.852±0.007; MAAD, 3.259±0.190°). During testing across each site, MASD ranged from 0.700±0.837 to 1.192±2.550 slices; MAAD from 2.811±2.348 to 4.396±7.158°; coronal IoU from 0.800±0.092 to 0.872±0.065; and sagittal IoU from 0.824±0.111 to 0.887±0.047. Mean clinical utility was 97.2%. Performance was noninferior to interrater variability across all sites and metrics. DL-based automated FOV prescription for shoulder MRI achieves performance comparable to radiographers, generalizes across institutions, and demonstrates high clinical utility.
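The overlap and angle metrics named above can be illustrated with a simplified sketch. Note the assumptions: the study evaluates oriented bounding boxes, whereas the IoU below handles only axis-aligned boxes to keep the example short, and the angle helper treats orientations as equivalent modulo 180°, which is a choice made here and not stated in the abstract. All function names and sample boxes are hypothetical.

```python
def iou_axis_aligned(box_a, box_b):
    """IoU of two axis-aligned boxes given as (x_min, y_min, x_max, y_max).

    Simplification: oriented boxes would need polygon intersection instead.
    """
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def abs_angle_diff_deg(angle_a, angle_b):
    """Smallest absolute difference between two orientations, modulo 180 degrees."""
    d = abs(angle_a - angle_b) % 180.0
    return min(d, 180.0 - d)


if __name__ == "__main__":
    predicted = (0.0, 0.0, 2.0, 2.0)   # hypothetical predicted FOV box
    reference = (1.0, 0.0, 3.0, 2.0)   # hypothetical radiographer FOV box
    print(iou_axis_aligned(predicted, reference))
    print(abs_angle_diff_deg(170.0, 5.0))
```

Averaging `abs_angle_diff_deg` over all test examinations would give a mean absolute angle difference in the sense used above, under the stated modulo-180° assumption.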