Related Topics

  • Stereo Matching

Articles published on Stereo Network

125 Search results
  • Research Article
  • 10.1145/3799231
APMVS: Learning Multi-View Stereo Based on Adjacent Stage and Pair-Wise Stage Uncertainty Estimation
  • Apr 20, 2026
  • ACM Transactions on Multimedia Computing, Communications, and Applications
  • Mingwei Cao + 6 more

Many multi-view stereo (MVS) networks with a cascaded structure can effectively estimate depth while saving memory. However, the accuracy of the depth map in the fine stage depends on the depth map estimated in the coarse stage. Additionally, the multi-stage depth maps generated by the cascaded structure are used to compute losses but are not reused, resulting in a loss of inter-stage differentiation information. To address these issues, we propose a dual-uncertainty estimation MVS method that learns an MVS network based on adjacent stage and pair-wise stage uncertainty estimation, named APMVS. The core of the proposed APMVS is to employ dual-uncertainty estimation to mitigate the adverse effects of the cascaded structure. Specifically, it involves two estimation modules: adjacent stage uncertainty (ASU) and pair-wise stage uncertainty (PSU). The ASU estimation module dynamically adjusts the depth-hypothesis range by leveraging uncertainty from the previous stage, thereby improving the accuracy of depth-map prediction in the current stage. The PSU estimation module estimates the uncertainty between each pair of stages. Thus, regions with high uncertainty have minimal impact. We evaluate the proposed APMVS on the DTU, Tanks and Temples, and BlendedMVS datasets. Experimental results show that our method achieves superior reconstruction quality compared with other state-of-the-art methods.
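The idea of narrowing the depth-hypothesis range using the previous stage's uncertainty can be illustrated with a minimal sketch. This is not the APMVS implementation: the function name `depth_hypotheses`, the per-pixel standard deviation `prev_sigma`, and the scale factor `k` are all assumptions made for illustration, since the abstract does not give the actual formula.

```python
def depth_hypotheses(prev_depth, prev_sigma, num_hyp=8, k=3.0, min_half_range=1e-3):
    """Sample depth hypotheses for one pixel around the previous-stage estimate.

    Hypothetical rule: the half-width of the search interval is k * prev_sigma,
    so confident pixels (small sigma) get a narrow, finely sampled range and
    uncertain pixels get a wide one. Requires num_hyp >= 2.
    """
    if num_hyp < 2:
        raise ValueError("num_hyp must be at least 2")
    half = max(k * prev_sigma, min_half_range)  # never collapse to a single depth
    lo, hi = prev_depth - half, prev_depth + half
    step = (hi - lo) / (num_hyp - 1)
    return [lo + i * step for i in range(num_hyp)]


# A confident pixel and an uncertain pixel with the same coarse depth.
confident = depth_hypotheses(10.0, 0.1, num_hyp=5, k=2.0)
uncertain = depth_hypotheses(10.0, 0.5, num_hyp=5, k=2.0)
```

With the same number of hypotheses, the confident pixel is sampled over a range five times narrower than the uncertain one, which is the trade-off an uncertainty-driven range is meant to exploit.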

  • Research Article
  • 10.1109/tetci.2026.3663502
Deep Illumination Invariance Feature Enhancement and Trilateral Feature Aggregation for Stereo Matching Network
  • Apr 1, 2026
  • IEEE Transactions on Emerging Topics in Computational Intelligence
  • Guowei An + 5 more

Stereo matching is an important part of environment perception in robot vision. Reflective regions, large textureless regions, repetitive-texture regions, and depth-discontinuity regions remain difficult and error-prone for stereo matching. We propose a novel stereo matching network that addresses these problems. First, to handle reflective and large textureless regions, we propose an illumination invariance feature enhancing module, which helps the feature extraction network extract richer features in those regions. The module is a differentiable transformation of the Census-based matching cost computation and can be combined with multiple stereo networks to improve their performance in difficult regions. Second, to handle repetitive-texture and depth-discontinuity regions, we propose a differentiable trilateral feature aggregation module that uses prior geometric knowledge of color similarity, spatial distance, and edge strength in the scene to strengthen the cost aggregation network in those regions. The proposed network is end-to-end and is verified by extensive experiments. The results show that the proposed method handles the above difficult regions effectively and achieves competitive performance on the Scene Flow, KITTI 2012, and KITTI 2015 datasets.
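For context, the classical (non-differentiable) Census transform that the abstract refers to can be sketched as follows. This is the textbook version, not the paper's differentiable module; the function names and the border-clamping choice are assumptions made for illustration.

```python
def census_transform(img, radius=1):
    """Encode each pixel as a bit string of 'neighbour darker than centre' tests.

    img is a list of rows of numeric intensities. Borders are handled by
    clamping coordinates (an illustrative choice, not taken from the paper).
    """
    h, w = len(img), len(img[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            bits = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue
                    ny = min(max(y + dy, 0), h - 1)
                    nx = min(max(x + dx, 0), w - 1)
                    bits = (bits << 1) | (1 if img[ny][nx] < img[y][x] else 0)
            out[y][x] = bits
    return out


def hamming_cost(a, b):
    """Matching cost between two Census codes: number of differing bits."""
    return bin(a ^ b).count("1")


# The codes depend only on intensity ordering, so a gain/offset change
# (a crude stand-in for an illumination change) leaves them unchanged.
image = [[10, 20, 30], [40, 50, 60], [70, 80, 90]]
brighter = [[2 * v + 15 for v in row] for row in image]
same_codes = census_transform(image) == census_transform(brighter)
```

Because the comparison `<` is a hard threshold, this version has no useful gradient, which is presumably why a differentiable reformulation is needed before it can be trained end-to-end.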

  • Research Article
  • 10.1016/j.neunet.2026.109028
Differential Feature Guidance and Compressed Cost Volume for Large Disparity Stereo Network
  • Apr 1, 2026
  • Neural Networks
  • Xiaoyang Zhao + 4 more


  • Research Article
  • 10.3390/vehicles8020028
Real-Time 3D Scene Understanding for Road Safety: Depth Estimation and Object Detection for Autonomous Vehicle Awareness
  • Feb 2, 2026
  • Vehicles
  • Marcel Simeonov + 2 more

Accurate depth perception is vital for autonomous driving and roadside monitoring. Traditional stereo vision methods are cost-effective but often fail under challenging conditions such as low texture, reflections, or complex lighting. This work presents a perception pipeline built around FoundationStereo, a Transformer-based stereo depth estimation model. At low resolutions, FoundationStereo achieves real-time performance (up to 26 FPS) on embedded platforms like NVIDIA Jetson AGX Orin with TensorRT acceleration and power-of-two input sizes, enabling deployment in roadside cameras and in-vehicle systems. For Full HD stereo pairs, the same model delivers dense and precise environmental scans, complementing LiDAR while maintaining a high level of accuracy. YOLO11 object detection and segmentation is deployed in parallel for object extraction. Detected objects are removed from depth maps generated by FoundationStereo prior to point cloud generation, producing cleaner 3D reconstructions of the environment. This approach demonstrates that advanced stereo networks can operate efficiently on embedded hardware. Rather than replacing LiDAR or radar, it complements existing sensors by providing dense depth maps in situations where other sensors may be limited. By improving depth completeness, robustness, and enabling filtered point clouds, the proposed system supports safer navigation, collision avoidance, and scalable roadside infrastructure scanning for autonomous mobility.
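The step of removing detected objects from the depth map before point cloud generation can be sketched with a standard pinhole back-projection. This is an illustrative sketch only: the function name, the boolean `object_mask` layout, and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions, and the paper's actual pipeline may differ.

```python
def depth_to_points(depth, object_mask, fx, fy, cx, cy):
    """Back-project a depth map to 3D points, skipping masked pixels.

    depth:       rows of metric depths (values <= 0 are treated as invalid).
    object_mask: rows of booleans, True where a detector/segmenter flagged an
                 object that should be excluded from the reconstruction.
    fx, fy, cx, cy: pinhole camera intrinsics in pixels.
    """
    points = []
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if object_mask[v][u] or z <= 0:
                continue
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


depth = [[2.0, 2.0], [0.0, 4.0]]
mask = [[False, True], [False, False]]  # top-right pixel belongs to an object
cloud = depth_to_points(depth, mask, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
```

Filtering in image space before back-projection keeps the mask logic two-dimensional and avoids having to segment the point cloud afterwards.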

  • Research Article
  • 10.1016/j.knosys.2025.115229
PDN-MVSNet: A texture-prior-informed multi-view stereo network for weak-texture reconstruction
  • Feb 1, 2026
  • Knowledge-Based Systems
  • Jie Han + 5 more


  • Research Article
  • 10.1016/j.engappai.2025.113600
Quadruplex-depth based multi-view stereo network with wave-shaped depth cells and Epipolar Transformer
  • Feb 1, 2026
  • Engineering Applications of Artificial Intelligence
  • Boyang Song + 3 more


  • Research Article
  • Citations: 1
  • 10.1145/3788678
RCAENet: Residual Convolutional and Attention-Enhanced Stereo Matching for Real-Time Depth Estimation on Edge Devices
  • Jan 19, 2026
  • ACM Transactions on Multimedia Computing, Communications, and Applications
  • Bifa Liang + 6 more

As a core technology in real-time video processing and intelligent surveillance, stereo matching provides essential depth perception capabilities for multimedia applications. However, high-precision stereo networks often come with significant computational costs, making real-time inference on power- and memory-constrained edge devices challenging. On the other hand, lightweight real-time networks still struggle with accuracy limitations. To address this challenge, we propose RCAENet, a high-performance stereo network designed for real-time and high-accuracy depth estimation on edge devices. To enhance feature extraction efficiency, we introduce the Residual Convolutional Feature Extraction (RCFE) module, which replaces conventional convolutional layers to capture more expressive features while maintaining computational efficiency. Additionally, we propose the Enhanced Adaptive Upsampling (EAU) module, which integrates channel and spatial attention mechanisms to improve feature fusion and disparity refinement. Furthermore, we design an Enhanced 3D CNN (E3DC) along with the Cost Aggregation and Residual Attention (CA-ResAgg) module for cost volume regularization. This module incorporates residual aggregation and efficient channel attention to further enhance disparity estimation accuracy. Built upon these components, RCAENet features a multi-scale architecture that effectively balances accuracy and efficiency. Extensive experiments demonstrate that these innovations enable RCAENet to achieve real‑time inference on edge devices while maintaining state‑of‑the‑art depth accuracy.

  • Research Article
  • 10.1109/access.2026.3666848
CR-FusionNet: Hybrid Stereo-Monocular Fusion for Depth Perception in Autonomous and Robotic Systems
  • Jan 1, 2026
  • IEEE Access
  • Hanyul Ryu + 3 more

Stereo matching yields metrically grounded disparity but often fails in textureless, repetitive, reflective, or occluded regions, while monocular depth estimation provides dense structure yet suffers from scale ambiguity and weaker local geometric fidelity. We present CR-FusionNet, a lightweight stereo–monocular fusion framework designed for resource-constrained robotic perception. Our pipeline uses CPU-based Semi-Global Block Matching (SGBM) as a deterministic metric anchor and fuses it with a monocular prior through a compact fully convolutional encoder–decoder. Before fusion, we align the monocular prediction to the stereo disparity scale via robust affine regression on valid stereo pixels. CR-FusionNet then performs confidence-aware routing between stereo and monocular cues and applies residual refinement to mitigate SGBM discretization artifacts, producing dense and stable metric depth. Experiments on KITTI Stereo 2015, Middlebury, and TartanAir show consistent improvements over single-source baselines while requiring significantly less GPU memory than deep stereo networks, supporting efficient onboard deployment.
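The alignment step, mapping a scale-ambiguous monocular prediction onto the stereo disparity scale, can be sketched as a scale-and-shift fit over valid stereo pixels. The sketch below uses ordinary least squares, which is a simplification: the abstract specifies a robust regression but not which one, and the function name and flattened-list inputs are assumptions.

```python
def fit_scale_shift(mono, stereo, valid):
    """Fit stereo ~= a * mono + b using only pixels where stereo is valid.

    mono, stereo: flattened per-pixel values; valid: matching booleans.
    Plain least squares (closed form), not the robust variant in the paper.
    """
    xs = [m for m, ok in zip(mono, valid) if ok]
    ys = [s for s, ok in zip(stereo, valid) if ok]
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two valid pixels")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    if var_x == 0:
        raise ValueError("monocular values are constant; scale is undefined")
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x
    b = mean_y - a * mean_x
    return a, b


mono = [1.0, 2.0, 3.0, 4.0]
stereo = [5.0, 7.0, 999.0, 11.0]       # third pixel is a stereo failure
valid = [True, True, False, True]      # ...and is excluded from the fit
a, b = fit_scale_shift(mono, stereo, valid)
aligned = [a * m + b for m in mono]    # monocular prediction on the stereo scale
```

A robust variant would additionally down-weight or discard valid-but-wrong stereo pixels, for example by refitting after trimming large residuals.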

  • Research Article
  • 10.1016/j.eswa.2025.129177
Super resolution enhanced multi-view stereo network based on Gumbel sampling
  • Jan 1, 2026
  • Expert Systems with Applications
  • Shichao Wang + 3 more


  • Research Article
  • 10.1016/j.aei.2025.104049
3D reconstruction of aerial images with symmetrical gradient integral regression multi-view stereo network
  • Jan 1, 2026
  • Advanced Engineering Informatics
  • Shichao Wang + 3 more


  • Research Article
  • 10.52549/.v13i4.6686
Robust Stereo Matching for Driver Assistance Systems Under Adverse Driving Conditions
  • Dec 21, 2025
  • Indonesian Journal of Electrical Engineering and Informatics (IJEEI)
  • Vinh Dinh Nguyen + 1 more

Deep stereo networks perform effectively when both training and testing data come from the same domain. However, their accuracy tends to drop significantly in efficiency-focused target scenarios due to domain shifts between training and testing datasets. These shifts often arise from differences in factors such as color, lighting, contrast, and texture. Additionally, the architecture of deep networks generally results in processing times that are unsuitable for real-time applications. To address these issues, this paper proposes a lightweight and robust stereo matching approach tailored for diverse driving environments. It leverages attention mechanisms for feature extraction and uses evolutionary algorithms for optimizing parameters. The method outperforms existing deep learning and traditional stereo matching techniques in terms of both processing speed and the percentage of bad pixels, as demonstrated on three challenging outdoor datasets: KITTI, HCI, and Driving Stereo. These results indicate that the proposed solution is highly effective for real-world applications where both precision and flexibility are essential.

  • Research Article
  • 10.1364/ol.578057
Deep photometric stereo for dynamic single-pixel 3D imaging
  • Dec 15, 2025
  • Optics Letters
  • Chongyang Zhang + 6 more

Dynamic single-pixel 3D imaging is challenging due to the requirement of complex calibration and the inherent tradeoff between resolution and the number of measurements. In this work, we propose a calibration-free framework that integrates binocular single-pixel imaging (SPI) with a super-resolution photometric stereo network (SRPS-Net) to achieve dynamic 3D SPI video. Photometric images reconstructed from arbitrary left and right viewpoints are processed by SRPS-Net to recover accurate surface normals without calibration. Experimental results show that our system achieves dynamic 3D reconstruction at a resolution of 128×128 with a frame rate of 6.5 fps, reaching pixel-level accuracy. The proposed method demonstrates robust generalization to complex objects and gestures, providing a compact, cost-effective, and calibration-free solution for dynamic single-pixel 3D imaging.

  • Research Article
  • 10.1016/j.optlastec.2025.113666
Enhancing photometric stereo via multi-channel information processing
  • Dec 1, 2025
  • Optics & Laser Technology
  • Xiaoyao Wei + 3 more


  • Research Article
  • 10.3390/electronics14224436
Stereo-GS: Online 3D Gaussian Splatting Mapping Using Stereo Depth Estimation
  • Nov 14, 2025
  • Electronics
  • Junkyu Park + 3 more

We present Stereo-GS, a real-time system for online 3D Gaussian Splatting (3DGS) that reconstructs photorealistic 3D scenes from streaming stereo pairs. Unlike prior offline 3DGS methods that require dense multi-view input or precomputed depth, Stereo-GS estimates metrically accurate depth maps directly from rectified stereo geometry, enabling progressive, globally consistent reconstruction. The frontend combines a stereo implementation of DROID-SLAM for robust tracking and keyframe selection with FoundationStereo, a generalizable stereo network that needs no scene-specific fine-tuning. A two-stage filtering pipeline improves depth reliability by removing outliers using a variance-based refinement filter followed by a multi-view consistency check. In the backend, we selectively initialize new Gaussians in under-represented regions flagged by low PSNR during rendering and continuously optimize them via differentiable rendering. To maintain global coherence with minimal overhead, we apply a lightweight rigid alignment after periodic bundle adjustment. On EuRoC and TartanAir, Stereo-GS attains state-of-the-art performance, improving average PSNR by 0.22 dB and 2.45 dB over the best baseline, respectively. Together with superior visual quality, these results show that Stereo-GS delivers high-fidelity, geometrically accurate 3D reconstructions suitable for real-time robotics, navigation, and immersive AR/VR applications.

  • Research Article
  • 10.26599/cvm.2025.9450472
Uncertainty Aware Multiple View Stereo Network with Accurate Supervision
  • Oct 1, 2025
  • Computational Visual Media
  • Xincheng Tang + 4 more

Learning-based multiple view stereo has gained significant attention recently. However, most methods rely on direct network supervision using provided ground-truth depth, which poses three inherent problems: resolution-dependent ground-truth artifacts, excessively challenging training examples (with relatively featureless textures), and use of less-viewed reference pixels for supervision, all of which hinder network optimization. To alleviate these problems, we propose an accurate network supervision paradigm that includes a ground-truth mask, an entropy mask, and a consistency mask, which provide more accurate supervision signals to aid network optimization. Furthermore, we introduce UANet, an uncertainty aware multi-view stereo network, which adaptively determines a pixel-wise search range using a dynamic range sampler (DRS) built upon estimation confidence and learned uncertainty. Experimental results on recent MVS datasets demonstrate the effectiveness of our method.
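One plausible reading of an entropy mask is to drop supervision at pixels whose predicted depth distribution is close to uniform. The sketch below follows that reading only; the abstract does not define the mask, so the function names, the threshold `max_entropy`, and the direction of the test are assumptions.

```python
import math


def entropy(prob):
    """Shannon entropy (in nats) of one pixel's depth-hypothesis distribution."""
    return -sum(p * math.log(p) for p in prob if p > 0)


def entropy_mask(prob_volume, max_entropy):
    """Return True for pixels kept for supervision, False for ambiguous ones.

    prob_volume: one probability list per pixel, each summing to 1.
    """
    return [entropy(p) <= max_entropy for p in prob_volume]


peaked = [0.97, 0.01, 0.01, 0.01]   # network is confident about one depth
flat = [0.25, 0.25, 0.25, 0.25]     # featureless region: no preferred depth
keep = entropy_mask([peaked, flat], max_entropy=1.0)
```

For a distribution over four hypotheses the maximum entropy is ln 4 (about 1.386), so a threshold of 1.0 keeps the peaked pixel and rejects the flat one.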

  • Research Article
  • 10.2478/ijanmc-2025-0021
Research on Multi-View Stereo Network Based on Self-Attention Mechanism
  • Sep 1, 2025
  • International Journal of Advanced Network, Monitoring and Controls
  • Wenkai Li + 3 more

As the technologies of virtual reality and augmented reality rapidly advance, the demand for high-quality 3D models has been growing exponentially. However, the Multi-View Stereo Network (MVSNet) for 3D reconstruction has faced issues with the inaccurate extraction of global image information and depth cues. In response to these challenges, this paper presents enhancements to MVSNet. First, the self-attention mechanism is introduced to enhance MVSNet's ability to capture global information in images. Second, a residual structure is added to mitigate the accuracy loss caused by the downsampling and upsampling of feature maps during the regularization process of cost volume, thus ensuring the integrity of information and transmission efficiency. Experimental results indicate that, in comparison with the original MVSNet, the SelfRes-MVSNet reduces the error rate by 1.3% in terms of overall accuracy and completeness, thereby improving the reconstruction effect from 2D images to 3D models.

  • Research Article
  • Citations: 1
  • 10.1016/j.knosys.2025.113911
Generalized deep aerial Multi-view Stereo network based on gradient balance masked representation learning
  • Sep 1, 2025
  • Knowledge-Based Systems
  • Shichao Wang + 3 more


  • Research Article
  • Citations: 12
  • 10.1109/tpami.2025.3557245
Revisiting One-Stage Deep Uncalibrated Photometric Stereo via Fourier Embedding
  • Aug 1, 2025
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Yakun Ju + 5 more

This paper introduces a one-stage deep uncalibrated photometric stereo (UPS) network, namely Fourier Uncalibrated Photometric Stereo Network (FUPS-Net), for non-Lambertian objects under unknown light directions. It departs from traditional two-stage methods that first explicitly learn lighting information and then estimate surface normals. Two-stage methods were deployed because the interplay of lighting with shading cues presents challenges for directly estimating surface normals without explicit lighting information. However, these two-stage networks are disjointed and separately trained, so the error in explicit light calibration propagates to the second stage and cannot be eliminated. In contrast, the proposed FUPS-Net utilizes an embedded Fourier transform network to implicitly learn lighting features by decomposing inputs, rather than employing a disjointed light estimation network. Our approach is motivated by observations in the Fourier domain of photometric stereo images: lighting information is mainly encoded in amplitudes, while geometry information is mainly associated with phases. Leveraging this property, our method "decomposes" geometry and lighting in the Fourier domain as guidance, via the proposed Fourier Embedding Extraction (FEE) block and Fourier Embedding Aggregation (FEA) block, which generate lighting and geometry features for the FUPS-Net to implicitly resolve the geometry-lighting ambiguity. Furthermore, we propose a Frequency-Spatial Weighted (FSW) block that assigns weights to combine features extracted from the frequency domain and those from the spatial domain for enhancing surface reconstructions. FUPS-Net overcomes the limitations of two-stage UPS methods, offering better training stability, a concise end-to-end structure, and avoiding accumulated errors in disjointed networks. Experimental results on synthetic and real datasets demonstrate the superior performance of our approach, and its simpler training setup, potentially paving the way for a new strategy in deep learning-based UPS methods.
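The amplitude/phase split that the abstract builds on can be illustrated on a one-dimensional signal. This sketch is not the FEE or FEA block; it only demonstrates the elementary fact that a global intensity scaling (a crude stand-in for a brightness change) alters Fourier amplitudes while leaving phases untouched. All function names are illustrative.

```python
import cmath
import math


def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def amplitude_phase(signal):
    """Split the spectrum into per-frequency amplitude and phase."""
    spectrum = dft(signal)
    return [abs(c) for c in spectrum], [cmath.phase(c) for c in spectrum]


def reconstruct(amplitude, phase):
    """Inverse DFT from amplitude and phase; returns the real part."""
    n = len(amplitude)
    spectrum = [cmath.rect(a, p) for a, p in zip(amplitude, phase)]
    return [
        (sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


signal = [1.0, 3.0, 2.0, 5.0]
amp, ph = amplitude_phase(signal)
amp_bright, ph_bright = amplitude_phase([2.0 * s for s in signal])
restored = reconstruct(amp, ph)
```

Here `amp_bright` is twice `amp` while `ph_bright` matches `ph`, and `reconstruct(amp, ph)` recovers the original signal, so the two components together lose no information.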

  • Research Article
  • Citations: 4
  • 10.1109/tpami.2025.3557498
Revisiting Supervised Learning-Based Photometric Stereo Networks
  • Aug 1, 2025
  • IEEE Transactions on Pattern Analysis and Machine Intelligence
  • Xiaoyao Wei + 7 more

Deep learning has significantly propelled the development of photometric stereo by handling the challenges posed by unknown reflectance and global illumination effects. However, how supervised learning-based photometric stereo networks resolve these challenges remains to be elucidated. In this paper, we aim to reveal how existing methods address these challenges by revisiting their deep features, deep feature encoding strategies, and network architectures. Based on the insights gained from our analysis, we propose ESSENCE-Net, which effectively encodes deep shading features with an easy-first-encoding strategy, enhances shading features with shading supervision, and accurately decodes normals with spatial context-aware attention. The experimental results verify that the proposed method outperforms state-of-the-art methods on three benchmark datasets, whether with dense or sparse inputs.

  • Research Article
  • 10.1364/ao.561785
Detail-aware multi-view stereo network for depth estimation
  • Jul 3, 2025
  • Applied Optics
  • Haitao Tian + 3 more

Multi-view stereo methods have achieved great success for depth estimation based on the coarse-to-fine depth learning frameworks; however, the existing methods perform poorly in recovering the depth of object boundaries and detail regions. To address these issues, we propose a detail-aware multi-view stereo network with a coarse-to-fine framework. The geometric depth clues hidden in the coarse stage are utilized to maintain the geometric structural relationships between object surfaces and enhance the expressive capability of image features. In addition, an image synthesis loss is employed to constrain the gradient flow for detailed regions and further strengthen the supervision of object boundaries and texture-rich areas. Finally, we propose an adaptive depth interval adjustment strategy to improve the accuracy of object reconstruction. Extensive experiments on the DTU and Tanks & Temples datasets demonstrate that our method achieves competitive results.


Copyright 2026 Cactus Communications. All rights reserved.
