Related Topics

  • Location Of Features
  • Feature Distance
  • Contour Features

Articles published on Features In Depth

1071 Search results
  • Research Article
  • 10.1371/journal.pone.0346343
Detection of submarine pipeline and cable targets based on depth feature of high resolution sonar image
  • Apr 7, 2026
  • PLOS One
  • Dandan Liu + 3 more

Feature extraction for submarine pipeline and cable targets in Side-Scan Sonar (SSS) imagery suffers from poor real-time performance, a high false detection rate, and difficulty deploying on edge equipment. To address these problems, this study applies a deep neural network to detect submarine pipeline and cable targets from deep features. To enable real-time detection of submarine pipelines and cables in SSS imagery, we improve the YOLO11n-seg model by incorporating the A2C2f and DSConv modules, leveraging the characteristic features of the target images. This reduces the false detection rate for submarine pipeline features in SSS images, shrinks the parameter count, and enables lightweight deployment. Ablation and comparative experiments are designed on the Marine-PULSE submarine pipeline and cable dataset. The experimental results show significant improvements over the original YOLO11n-seg model. Specifically, the modified model's bounding box recall improved by 9.7% and mAP@50-95 improved by 1.6%; instance segmentation recall improved by 10.3% and mAP@50 improved by 3.6%. Detection precision and completeness improve together, and the parameter count is reduced by 15%, which favors real-time performance. Regarding object detection, our model demonstrates superior performance, with its mAP@50 improved by 5.2% compared to YOLO12n-seg and by 12.5% compared to YOLO13n-seg. Experiments show that the model designed in this study is an effective method for real-time detection of SSS submarine pipeline and cable targets, with good prospects for further development and adoption.
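The recall and mAP@50 figures above rest on an intersection-over-union (IoU) match between predicted and ground-truth boxes. The sketch below is illustrative only: the function names and the greedy one-to-one matching are assumptions, not the evaluation code used in the paper.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def recall_at_iou(truths, preds, thr=0.5):
    """Fraction of ground-truth boxes matched by an unused prediction with IoU >= thr."""
    used = set()
    hits = 0
    for t in truths:
        best, best_iou = None, thr
        for i, p in enumerate(preds):
            iou = box_iou(t, p)
            if i not in used and iou >= best_iou:
                best, best_iou = i, iou
        if best is not None:
            used.add(best)  # each prediction may match only one ground truth
            hits += 1
    return hits / len(truths) if truths else 0.0
```

With `thr=0.5` this corresponds to the matching criterion behind the "@50" metrics; mAP itself additionally averages precision over confidence thresholds and classes, which is omitted here.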

  • Research Article
  • 10.1088/2631-8695/ae586b
Railroad transmission line foreign object detection based on multi-scale adaptive kernel deep feature fusion and channel pruning
  • Apr 1, 2026
  • Engineering Research Express
  • Siyuan Liu + 1 more

To address missed detections, false detections, and low detection efficiency in foreign object detection on railroad transmission lines, we propose MRS-YOLO, an improved detection algorithm based on YOLOv11n. First, a Multi-scale Adaptive Kernel Depth Feature Fusion (MAKDF) module is proposed and fused with the C3k2 module to form C3k2_MAKDF, which enhances the model's feature extraction capability for foreign objects of different sizes and shapes. Second, a novel Re-calibration Feature Fusion Pyramid Network (RCFPN) is designed as the neck structure to improve the feature fusion capability of the model. Third, a Spatial and Channel Reconstruction Detect Head (SC_Detect) based on spatial and channel preprocessing is designed to improve the overall detection accuracy of the model. Finally, channel pruning is used to reduce the redundancy of the improved model, substantially reducing parameters and GFLOPs and improving detection efficiency. The experimental results show that the mAP50 and mAP50:95 of the proposed MRS-YOLO algorithm reach 94.8% and 86.4%, respectively, which are 0.7 and 2.3 percentage points higher than the baseline, while parameters and GFLOPs are reduced by 44.2% and 17.5%, respectively. These results demonstrate that the improved algorithm is better suited to foreign object detection on railroad transmission lines.
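The abstract does not state which pruning criterion is used. As a rough illustration of channel pruning in general, the sketch below ranks output channels by the L1 norm of their filter weights and keeps a fixed fraction. The magnitude criterion, the function names, and the `keep_ratio` parameter are all assumptions made for this example.

```python
def l1_norm(filter_weights):
    """Sum of absolute weights of one output channel's (flattened) filter."""
    return sum(abs(w) for w in filter_weights)


def select_channels(filters, keep_ratio):
    """Return sorted indices of the channels to keep.

    filters: list of flat weight lists, one per output channel.
    keep_ratio: fraction of channels to retain (hypothetical hyperparameter).
    """
    n_keep = max(1, round(len(filters) * keep_ratio))
    ranked = sorted(range(len(filters)),
                    key=lambda i: l1_norm(filters[i]),
                    reverse=True)
    return sorted(ranked[:n_keep])
```

In a real network, the kept indices would also be used to slice the input channels of the following layer, and the pruned model is normally fine-tuned afterwards.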

  • Research Article
  • Citations: 1
  • 10.1016/j.aap.2026.108405
Enhancing vision-based traffic crash detection performance consistency across day-night scenes: A depth-aware and domain-adaptive network.
  • Apr 1, 2026
  • Accident; analysis and prevention
  • Yang Yang + 5 more


  • Research Article
  • 10.3390/s26061925
Comfort-Oriented Pothole Traversal Using Multi-Sensor Perception and Fuzzy Control.
  • Mar 19, 2026
  • Sensors (Basel, Switzerland)
  • Chaochun Yuan + 7 more

Potholes are typical negative road obstacles that can significantly compromise vehicle safety and ride comfort when traversed at inappropriate speeds. To address this issue, this paper proposes a pothole-detection-based, comfort-oriented pothole traversal algorithm that integrates multi-sensor fusion perception, comfort-constrained speed planning, and fuzzy control. A camera and a single-point ranging LiDAR are first fused to extract key geometric features of potholes, including contour, area, and depth. Based on these features, a vehicle-pothole dynamic model is developed in ADAMS to quantify the influence of pothole area and depth on vehicle vertical vibration. The vertical frequency-weighted root-mean-square (RMS) acceleration is adopted as the ride comfort indicator, based on which the maximum allowable traversal speed under different pothole geometries is determined. Furthermore, a longitudinal pothole traversal control strategy based on fuzzy theory is designed to regulate vehicle acceleration, enabling the vehicle to reach the comfort-constrained limiting speed within a finite preview distance while ensuring braking safety. The proposed method is validated through multi-scenario co-simulations using MATLAB/Simulink and CarSim, as well as real-vehicle experiments. Results demonstrate that the proposed strategy can effectively adjust vehicle speed before pothole traversal, satisfying comfort constraints and improving ride comfort without sacrificing driving safety.
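Two quantities in this abstract lend themselves to a small worked example: the RMS acceleration used as the comfort indicator, and the longitudinal acceleration needed to reach the comfort-constrained speed within the preview distance. The sketch below assumes constant acceleration and omits the frequency weighting the paper applies; the function names are illustrative and this is not the paper's fuzzy controller.

```python
import math


def rms(samples):
    """Plain (unweighted) root-mean-square of acceleration samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def required_acceleration(v0, v_limit, preview_distance):
    """Constant acceleration (m/s^2) that takes the vehicle from v0 to v_limit
    (both in m/s) over preview_distance metres, from v^2 = v0^2 + 2*a*d."""
    if preview_distance <= 0:
        raise ValueError("preview_distance must be positive")
    return (v_limit ** 2 - v0 ** 2) / (2.0 * preview_distance)
```

For example, slowing from 20 m/s to 10 m/s over 50 m requires a constant acceleration of (100 - 400) / 100 = -3 m/s^2; a controller would compare this against the braking-safety limit.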

  • Research Article
  • 10.1088/1361-6501/ae4cae
Monocular depth estimation for screw tightness state detection
  • Mar 13, 2026
  • Measurement Science and Technology
  • Jiacheng Wang + 5 more

Accurate detection of industrial screw assembly status is crucial for ensuring product quality and safety. This paper proposes an efficient monocular vision-based method for detecting screw tightness, significantly reducing reliance on expensive depth sensors. By constructing a dataset encompassing various screw types and fastening states, the system employs a monocular depth estimation model based on Depth Anything V2 and the Dense Prediction Transformer (DPT) to generate relative depth maps. To overcome the limitations of relative depth information, this paper introduces a novel normalized feature extraction method that computes depth differences between the screw region and its surrounding area to extract robust screw-state representation features. Building on this, we design a Sparse Convolutional Residual 4-path Network (SCR_R4Net) that integrates a Convolutional Block Attention Module (CBAM) to effectively fuse RGB images with normalized depth features. Finally, the fused features and scalar depth information are fed into a regressor to predict screw-to-surface distances, and the tightness state is determined via threshold comparison. Experimental results demonstrate that this method can accurately identify subtle variations in screw states, offering a practical, vision-based alternative for automated screw tightness monitoring in defined industrial scenarios where stable top-down views are maintained.
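The idea of a depth difference between the screw region and its surroundings can be sketched as follows. The abstract does not give the normalization, so dividing by the depth range of the patch is an assumption, as are the function names and the threshold value.

```python
def _mean(values):
    return sum(values) / len(values)


def normalized_depth_difference(depth, mask):
    """depth: 2D list of relative depth values for a patch around the screw.
    mask: 2D list of 0/1 with 1 marking screw pixels.
    Returns (mean screw depth - mean surrounding depth) / depth range of the patch."""
    screw = [d for drow, mrow in zip(depth, mask) for d, m in zip(drow, mrow) if m]
    surround = [d for drow, mrow in zip(depth, mask) for d, m in zip(drow, mrow) if not m]
    if not screw or not surround:
        raise ValueError("mask must contain both screw and surrounding pixels")
    values = screw + surround
    span = max(values) - min(values)
    if span == 0:
        return 0.0  # flat patch: no measurable protrusion
    return (_mean(screw) - _mean(surround)) / span


def is_loose(feature, threshold=0.5):
    """Threshold comparison; 0.5 is a placeholder, not a value from the paper."""
    return feature > threshold
```

Because the feature is a ratio, it is unchanged by the unknown scale and offset of a relative depth map, which is the property the normalization is meant to provide.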

  • Research Article
  • 10.3390/rs18050788
PSiam-HDSFNet: A Pseudo-Siamese Hybrid Dilation Spiral Feature Network for Flood Inundation Change Detection Based on Heterogeneous Remote Sensing Imagery
  • Mar 4, 2026
  • Remote Sensing
  • Yichuang Luo + 6 more

Flood change detection from remote sensing data can be used to identify post-disaster flooded areas, providing decision support for emergency rescue and post-disaster reconstruction. Although the combination of SAR and optical images effectively addresses obscuration by clouds and rain, the inherent difference in their imaging mechanisms poses a challenge to improving the accuracy of flood area change detection. Furthermore, existing flood inundation change detection methods based on heterogeneous remote sensing imagery struggle to distinguish small ground objects within the background from the actual inundated regions. Therefore, a pseudo-Siamese hybrid dilation spiral feature network (PSiam-HDSFNet) is proposed in this paper. Firstly, the feature extraction pipeline progressively processes optical and SAR images through five-layer Enhanced Deep Residual Blocks and five-layer Residual Dense Blocks, respectively. A Hybrid Dilated Pyramid (HDP) module based on a sawtooth wave-like dilated coefficient is designed to enhance multi-scale semantics of deep features in order to selectively reinforce semantic features in flood areas and weaken the noise semantics from small ground objects. Then, a Spiral Feature Pyramid (SFP) module is designed to make the deep features of SAR and optical images more consistent in spatial structure and numerical distribution patterns, so that the features of flood areas become more prominent while the noise semantics from small ground objects are further suppressed. After that, the Galerkin-type attention with linear complexity is introduced to the decoder, rapidly reconstructing the abstract semantic information of floods into interpretable flood features. Finally, the Align OPT-SAR (AlignOS) method is designed to align SAR and optical image features, enabling subsequent flood area detection. Seven metrics are adopted in the comparison between PSiam-HDSFNet and the other 14 methods. 
The results indicate that PSiam-HDSFNet improves change detection accuracy by extracting and processing depth features of these two images without image domain translation, and its F1 scores improve by 7.704%, 7.664%, 4.353%, and 1.111% in the four flood coverage category detection tasks compared with the second-best method.

  • Research Article
  • 10.1016/j.media.2026.103940
Depth-induced prompt learning for laparoscopic liver landmark detection.
  • Mar 1, 2026
  • Medical image analysis
  • Ruize Cui + 6 more


  • Research Article
  • Citations: 2
  • 10.1016/j.jag.2026.105146
Nightlight–nightlife interplay: Threshold effects of ambient lighting on perceived safety and physical activity at night
  • Mar 1, 2026
  • International Journal of Applied Earth Observation and Geoinformation
  • Yuankai Wang + 5 more

  • First city-wide high-fidelity nighttime SVIs generated for Hong Kong.
  • Image-based brightness validated against field illuminance measurements.
  • Eye-level brightness outperforms satellite radiance to explain nighttime exercise.
  • Near-field and line-of-sight lighting facilitate perceived safety.
  • Perceived safety gains diminish beyond a 40–60 Luma lighting threshold.

Artificial light at night (ALAN) influences safety perceptions, public health, energy use, and biodiversity, yet its eye-level impacts remain difficult to quantify at city scale. Meanwhile, reducing outdoor lighting is especially contested because the widely held belief that “brighter is safer” sustains excessive ALAN even in low-crime cities such as Hong Kong. We assert that unveiling the threshold effect of brightness on perceived safety and human behavior is crucial for shifting the linear paradigm and informing acceptable lighting adjustments. However, auditing urban-scale nighttime perceived safety is challenging because nighttime street view imagery (SVI) is almost nonexistent. Most studies rely on limited nighttime samples, satellite radiance, or daytime SVI as proxies, ignoring the spatial heterogeneity of nightscapes and differences between day and night. Moreover, lighting intensity is often examined in isolation; how it interacts with urban forms is underexplored. To address these gaps, we developed a multimodal diffusion model trained on 2600 paired day–night SVIs and conditioned on SDGSAT-1 radiance (40 m) and POI context to synthesize context-aware nighttime scenes from 58,500 daytime SVIs (50 m spacing) across Hong Kong. A validated vision-language model was used to score perceived nighttime safety from generated images, while semantic area, brightness, and depth (ABD) features were extracted to model their interactions. To characterize human behaviors, approximately 150,000 volunteered trajectories of physical activities were collected.
The ABD interactions extracted from eye-level nightscapes alone explained 49.3% (36.5%) of the variation in day–night activity disparities (perceived nighttime safety), significantly outperforming satellite-based radiance. Near-field, line-of-sight lighting on facades, trees, and sidewalks is consistently associated with higher safety and stronger nighttime activity retention. Brightness shows diminishing gains beyond ∼ 40–60 luma DN (∼19–40 lx as an interpretive reference from our DN–lux field cross-walk) on nighttime perception and physical activity across heterogeneous urban contexts, indicating significant potential for reduced light pollution in overlit areas. This study demonstrates the necessity and provides a scalable toolkit for auditing nighttime environments at eye level to support sustainable and healthy cities.

  • Research Article
  • 10.1093/jas/skag051
Pose estimation based on keypoints and monocular depth estimation for predicting cattle body weight and hip height.
  • Feb 18, 2026
  • Journal of animal science
  • Guilherme L Menezes + 9 more

Computer vision systems (CVS) have been developed using either top-down view 3D images or side-view 2D images to predict body weight (BW) and hip height (HH). However, 3D imaging systems are often costly compared with 2D imagery setups, and side-view cameras are often difficult to deploy under commercial farm conditions due to occlusion and variation in animal distance and posture. Herein, pose estimation using top-down view 2D images offers a promising approach to automatically extract keypoint-based features that describe body biometrics and could provide information correlated with BW and HH. Additionally, the same 2D images could be used to generate depth information, such as volume and height, using monocular depth estimation (MDE). Therefore, this study aimed to (1) develop predictive models for BW and HH based on features extracted from body pose keypoints in 2D infrared images and depth images generated using MDE, and (2) compare these models with those using features extracted from depth images collected by a 3D imaging system. A total of 395 top-down view videos from 94 beef-on-dairy crossbred cattle across four experimental blocks were collected using infrared and depth sensors. BW was recorded using an electronic scale, and HH was manually measured using a measuring stick. A pose estimation model identified seven anatomical landmarks (i.e. keypoints). The same 2D infrared images were converted into 3D images using zero-shot MDE, and a pipeline extracted features including volume, area, circularity, eccentricity, as well as back heights and widths. Depth images from a 3D imaging system were processed using the same pipeline. Random Forest (RF), Partial Least Squares Regression (PLS), and Support Vector Regression (SVM) models were evaluated using a leave-one-block-out cross-validation approach. The PLS model using Euclidean distances between the keypoints as features achieved R2 values of 0.90 with a Root Mean Square Error (RMSE) of 33.1 kg. 
Using MDE-derived depth features, PLS achieved an R2 of 0.95 with an RMSE of 24.2 kg. For HH, PLS using keypoints achieved an R2 of 0.77 with RMSE of 3.2 cm, and models using MDE-derived depth features showed similar performance. Our findings demonstrate that biometric features extracted from top-down 2D images, or MDE-derived depth features, enable predictive performance for BW and HH comparable to that of models using features extracted from depth images collected with a 3D imaging system.
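The keypoint-based features and the two reported metrics can be sketched in a few lines. With seven keypoints, all pairwise Euclidean distances give 21 features per image. The function names are illustrative; the regression models themselves (PLS, RF, SVM) and the leave-one-block-out split are not reproduced here.

```python
import math
from itertools import combinations


def pairwise_distances(keypoints):
    """Euclidean distance between every pair of (x, y) keypoints."""
    return [math.dist(p, q) for p, q in combinations(keypoints, 2)]


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot
```

Under leave-one-block-out cross-validation, these metrics would be computed on each held-out experimental block in turn, so that animals from the test block never appear in training.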

  • Research Article
  • 10.1007/s11517-025-03512-w
Cardiac multi-structure segmentation network based on the fused dual attention mechanism.
  • Feb 10, 2026
  • Medical & biological engineering & computing
  • Guodong Zhang + 6 more

Cardiac segmentation and quantification of cardiac function indicators play a crucial role in the clinical diagnosis and treatment of cardiovascular diseases. To address the issue of blurred cardiac chamber boundaries and adjacent tissue interference resulting from similar intensity in computed tomography (CT) images, this paper proposes a 3D cardiac multi-structure segmentation network utilizing Multi-scale Channel Enhancement Attention (MCEA) and Spatial Decomposition with Channel Fusion Attention (SD-CA). The MCEA module integrates channel information from feature maps of various scales within the coding layer, thereby enhancing contextual linkage, strengthening the network's multi-scale feature representation capability, and improving decoding and segmentation performance. The SD-CA module generates spatial and channel attention weights in parallel and combines the three directional features of height, width, and depth. This enables the network to effectively concentrate on the region of interest and mitigate the interference of irrelevant structures. Experimental evaluations were conducted using a dataset of 192 cases provided by the People's Hospital of Liaoning Province and the MM-WHS dataset. Segmentation was achieved for the left ventricle, myocardium, left atrium, right ventricle, and right atrium, with average Dice coefficients of 94.21% and 93.9%, and average 95% Hausdorff distances of 6.5483 and 4.36, respectively. Furthermore, quantitative predictions of the left ventricular ejection fraction (LVEF) and substructure volumes were derived from the segmentation results. The correlation coefficients between the predicted and true values exceeded 0.9587, and all fell within the maximum error range of the Bland-Altman test for over 94.8% of the data, indicating a strong correlation and agreement between the predicted and true values.
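The Dice coefficient reported above measures overlap between a predicted and a reference mask. A minimal sketch on flattened binary masks follows; returning 1.0 when both masks are empty is a convention chosen for this example, not something stated in the abstract.

```python
def dice_coefficient(pred, truth):
    """Dice = 2 * |A ∩ B| / (|A| + |B|) for flat binary masks of equal length."""
    if len(pred) != len(truth):
        raise ValueError("masks must have the same number of voxels")
    intersection = sum(1 for p, t in zip(pred, truth) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in truth if t)
    if total == 0:
        return 1.0  # both masks empty: treated as perfect agreement
    return 2.0 * intersection / total
```

For a multi-structure result such as the five cardiac substructures, the coefficient would be computed per label and then averaged.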

  • Research Article
  • 10.3390/app16041673
Small Object Detection with Efficient Multi-Scale Collaborative Attention and Depth Feature Fusion Based on Detection Transformer
  • Feb 7, 2026
  • Applied Sciences
  • Boran Song + 4 more

Existing DEtection TRansformer-based (DETR) object detection methods have been widely applied to standard object detection tasks, but still face numerous challenges in detecting small objects. These methods frequently miss the fine details of small objects and fail to preserve global context, particularly under scale variation or occlusion. The resulting feature maps lack sufficient spatial and structural information. Moreover, some DETR-based models specifically designed for small object detection often have poor generalization capabilities and are difficult to adapt to datasets with diverse object scales and complex backgrounds. To address these issues, this paper proposes a novel object detection model—small object detection with efficient multi-scale collaborative attention and depth feature fusion based on DETR (ED-DETR)—which consists of three core modules: an efficient multi-scale collaborative attention mechanism (EMCA), DepthPro, a zero-shot metric monocular depth estimation model, and an adaptive feature fusion module for depth maps and feature maps. Specifically, EMCA extends the single-space attention mechanism in efficient multi-scale attention (EMA) to a composite structure of parallel spatial and channel attention, enhancing ED-DETR’s ability to express features collaboratively in both spatial and channel dimensions. DepthPro generates depth maps to extract depth information. The adaptive feature fusion module integrates depth information with RGB visual features, improving ED-DETR’s ability to perceive object position, scale, and occlusion. The experimental results show that ED-DETR achieves the current best 33.6% mAP on the AI-TOD-V2 dataset, which predominantly contains tiny objects, outperforming previous CNN-based and DETR-based methods, and shows excellent generalization performance on the VisDrone and COCO datasets.

  • Research Article
  • 10.1038/s41598-026-35527-0
An improved seam carving method for enhancing the visual field of tunnel vision patients.
  • Feb 3, 2026
  • Scientific reports
  • Dina El-Torky + 9 more

Visual impairment takes various forms, all of which negatively affect patients' daily activities and hinder simple actions such as walking safely in a street. Content-aware image retargeting can be used to enhance the scene for patients with a limited visual field, i.e., tunnel vision. This paper presents a modified Seam Carving method that decreases the width of the input image to fit within the patient's angle of vision while preserving the important objects and details of the original image. The method enhances the original Seam Carving by calculating the energy map using multiscale image fusion that combines depth, saliency, foreground segmentation, and edge detection features, and uses a forward-middle approach for the seam removal step. The results outperformed various retargeting methods, achieving a 30.8% improvement in the composite score that integrates structural, perceptual, and feature-based quality metrics. Statistical analysis using paired t-tests ([Formula: see text]) confirmed statistically significant improvements across all major metrics ([Formula: see text]), including SSIM, SIFT feature matching, and modern deep learning-based perceptual quality metrics, compared to the baseline seam carving method.
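For orientation, the baseline that the paper modifies can be sketched as follows: an energy map is built, the lowest-energy vertical seam is found by dynamic programming, and that seam is removed to shrink the width by one pixel. The weighted-sum fusion in `fuse_energy` is a simplifying assumption standing in for the paper's multiscale image fusion, and the seam search is the classic backward formulation rather than the forward-middle approach.

```python
def fuse_energy(maps, weights):
    """Pixel-wise weighted sum of equally sized 2D cue maps (assumed fusion rule)."""
    height, width = len(maps[0]), len(maps[0][0])
    return [[sum(w * m[y][x] for m, w in zip(maps, weights))
             for x in range(width)] for y in range(height)]


def min_vertical_seam(energy):
    """Column index per row of the 8-connected vertical seam with minimum total energy."""
    height, width = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    for y in range(1, height):
        for x in range(width):
            lo, hi = max(0, x - 1), min(width, x + 2)
            cost[y][x] += min(cost[y - 1][lo:hi])
    # Backtrack from the cheapest cell in the last row.
    x = min(range(width), key=lambda i: cost[height - 1][i])
    seam = [x]
    for y in range(height - 1, 0, -1):
        lo, hi = max(0, x - 1), min(width, x + 2)
        x = min(range(lo, hi), key=lambda i: cost[y - 1][i])
        seam.append(x)
    seam.reverse()
    return seam


def remove_seam(image, seam):
    """Delete one pixel per row, reducing the image width by one."""
    return [row[:x] + row[x + 1:] for row, x in zip(image, seam)]
```

Repeating the find-and-remove step narrows the image until it fits the target width; raising the energy of depth, saliency, or foreground cues steers seams away from important objects.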

  • Research Article
  • 10.14358/pers.25-00028r3
Graph Neural Network-Based Land Cover-Classification of Remote Sensing Images Using Multi-Scale and Depth Features
  • Feb 1, 2026
  • Photogrammetric Engineering & Remote Sensing
  • Jiexi Liu + 3 more

Remote sensing land-cover classification can provide valuable data support for natural resource management. Existing classification methods based on graph-neural networks rely mainly on the global features and non-Euclidean structural features of image objects without considering the local features that describe their internal structures and the raster-depth features in the form of Euclidean structures. To this end, this paper presents a multi-scale and deep-feature remote sensing image land-cover classification method that embeds raster-depth features into node features and captures multi-scale graph-embedding information from global graphs and subgraphs to fully express image information. The depth-feature map of the image is obtained through a Visual Geometry Group 16-layer network and integrated into the feature space. The fractal network evolution algorithm is adopted to obtain multi-scale image objects. Global-scale features such as spectral, texture, index, and raster-depth features of the image objects are extracted, and local-scale features (e.g., average degree, average path length, graph diameter, average clustering coefficient, small-world effect) of the subgraphs are extracted to construct multi-scale depth features. The composite global graph structure is constructed by adopting adaptive weights, the graph embeddings are extracted via the graph convolutional network, and the node categories are predicted via SoftMax.
For the Gaofen Image Dataset (GID‐15, a public benchmark dataset for land cover classification) and the 2017 China Computer Federation Remote Sensing Image Classification Dataset (CCF 2017, released in the 2017 China Computer Federation Big Data and Computational Intelligence Contest), as compared with the traditional method that considers only the global scale and the single-graph structure, this method improves the overall accuracy by 3.83% and 3.46%, respectively, and increases the kappa coefficient by 0.0681 and 0.0637, respectively, which indicates its effectiveness.
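Overall accuracy and the kappa coefficient are both derived from a confusion matrix. The sketch below uses the standard Cohen's kappa definition; the function names are illustrative.

```python
def overall_accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


def cohen_kappa(confusion):
    """kappa = (p_o - p_e) / (1 - p_e), where p_e is chance agreement
    computed from the row and column marginals."""
    n = len(confusion)
    total = sum(sum(row) for row in confusion)
    p_o = overall_accuracy(confusion)
    row_sums = [sum(confusion[i]) for i in range(n)]
    col_sums = [sum(confusion[i][j] for i in range(n)) for j in range(n)]
    p_e = sum(r * c for r, c in zip(row_sums, col_sums)) / (total * total)
    return (p_o - p_e) / (1.0 - p_e)
```

For the matrix `[[20, 5], [10, 15]]`, overall accuracy is 35/50 = 0.7, chance agreement is 0.5, and kappa is 0.4, which shows why kappa is the stricter of the two figures.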

  • Research Article
  • 10.1029/2025jh000985
Earthquake Source Depth Determination Using Single Station Waveforms and Deep Learning
  • Feb 1, 2026
  • Journal of Geophysical Research: Machine Learning and Computation
  • Wenda Li + 1 more

In areas with limited station coverage, earthquake depth constraints are much less accurate than their latitude and longitude. Traditional travel‐time‐based location methods struggle to constrain depths due to imperfect station distribution and the strong trade‐off between source depth and origin time. Identifying depth phases at regional distances is usually hindered by strong wave scattering, which is particularly challenging for low‐magnitude events. Extracting effective depth features from single or sparse stations to enhance depth constraints is a pressing challenge. Deep learning algorithms, capable of extracting various features from seismic waveforms, including phase arrivals, amplitudes, and frequency, offer promising constraints to earthquake depths. In this work, we propose a novel depth feature extraction network (named VGGDepth), which directly maps seismic waveforms to earthquake depth using single‐station three‐component waveforms. The network structure is adapted from VGG16 in computer vision. It is designed to take single‐station three‐component waveforms as inputs and produce depths as outputs. Two scenarios are considered in our model development: (a) training and testing solely on the same station, and (b) generalizing by training and testing on different seismic stations within a particular region. We demonstrate the efficacy using seismic data from the 2016–2017 Central Apennines, Italy earthquake sequence. Results demonstrate that earthquake depths can be estimated from single stations with uncertainties of hundreds of meters. These uncertainties are further reduced by averaging results from multiple stations. Our method shows strong potential for earthquake depth determination, particularly for events recorded by single or sparsely distributed stations, such as historically instrumented earthquakes.
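The reduction in uncertainty from multi-station averaging can be illustrated with the standard error of the mean. This sketch assumes the per-station errors are independent and of similar size, which the abstract does not state; the function name is illustrative.

```python
import math
import statistics


def combine_station_depths(depths_km):
    """Return (mean depth, standard error of the mean) for per-station estimates.

    With independent errors of similar size, the standard error shrinks
    roughly as 1 / sqrt(number of stations)."""
    n = len(depths_km)
    mean_depth = statistics.fmean(depths_km)
    if n < 2:
        return mean_depth, float("nan")  # spread is undefined for one station
    sem = statistics.stdev(depths_km) / math.sqrt(n)
    return mean_depth, sem
```

If station errors are correlated, for example through a shared velocity-model bias, the gain from averaging will be smaller than this estimate suggests.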

  • Research Article
  • 10.1109/jbhi.2025.3593487
Feasibility Study of a Diffusion-Based Model for Cross-Modal Generation of Knee MRI From X-Ray: Integrating External Radiographic Feature Information.
  • Feb 1, 2026
  • IEEE journal of biomedical and health informatics
  • Zhe Wang + 8 more

Knee osteoarthritis (KOA) is a prevalent musculoskeletal disorder, often diagnosed using X-rays due to its cost-effectiveness. While Magnetic Resonance Imaging (MRI) provides superior soft tissue visualization and serves as a valuable supplementary diagnostic tool, its high cost and limited accessibility significantly restrict its widespread use. To explore the feasibility of bridging this imaging gap, we conducted a feasibility study leveraging a diffusion-based model that uses an X-ray image as conditional input, alongside target depth and additional patient-specific feature information, to generate corresponding MRI sequences. Our findings demonstrate that the MRI volumes generated by our approach are not only visually closer to real MRI scans compared with other methods but also achieve the highest quantitative performance in terms of Peak Signal-to-Noise Ratio (PSNR) and Structural Similarity Index (SSIM). Furthermore, by increasing the number of inference steps to interpolate between slice depths, we enhance the continuity of the generated volume, achieving higher adjacent slice correlation coefficients. Through ablation studies, we further validate that integrating supplemental patient-specific information, beyond what X-rays alone can provide, enhances the accuracy and clinical relevance of the generated MRI, which underscores the potential of leveraging external patient-specific information to improve the performance of the MRI generation.
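Of the two image-quality metrics named above, PSNR is simple enough to sketch directly. The example assumes flattened intensity lists and an 8-bit peak value of 255; SSIM involves local windowed statistics and is not reproduced here.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB: 10 * log10(MAX^2 / MSE)."""
    if len(reference) != len(generated):
        raise ValueError("images must have the same number of pixels")
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)
```

For MRI volumes stored as floating-point intensities, `max_value` would be set to the data range actually used rather than 255.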

  • Research Article
  • 10.1109/jphot.2025.3645783
Multi-Temporal Resolution-Guided 3-D Imaging for Array-Based Single-Photon LiDAR
  • Feb 1, 2026
  • IEEE Photonics Journal
  • Kai Qiao + 9 more

Using a 1550 nm array-based single-photon LiDAR system, we demonstrated depth profiling of both static and dynamic targets up to a distance of 10 kilometers. The system comprises a 1550 nm pulsed laser source, a bistatic optical transceiver system, and a 64×64 InGaAs/InP Single Photon Avalanche Diode (SPAD) array camera, with an angular resolution of 20 μrad. By employing a recovery optimization algorithm guided by multi-scale time resolution, we utilized unsupervised learning methods to achieve three-dimensional (3D) image segmentation. Subsequently, we accomplished pixel-level algorithm matching, facilitating efficient long-range 3D imaging reconstruction with significantly reduced binary frame data. Notably, after offline processing of real point cloud data collected by our system, we obtained depth images of various targets within a 4 to 10 km range. Furthermore, we successfully captured dynamic 3D video of targets at a frame rate exceeding 50 fps. The video was reconstructed using offline processing with an average of fewer than 2 photons returned per pixel. These depth results highlight the potential of the proposed system and reconstructed method for depth profiling, feature extraction, and target recognition of distant static and dynamic targets.
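The basic relation behind single-photon LiDAR depth is the round-trip time of flight, d = c·t/2. The sketch below applies it to the most populated timing bin of one pixel. This is the textbook per-pixel histogram approach, included only as background; it is not the multi-temporal-resolution reconstruction described in the abstract, and the function names and bin width are assumptions.

```python
from collections import Counter

SPEED_OF_LIGHT = 299_792_458.0  # m/s


def depth_from_round_trip(time_s):
    """Range in metres for a round-trip time of flight in seconds."""
    return SPEED_OF_LIGHT * time_s / 2.0


def depth_from_photon_bins(bin_indices, bin_width_s):
    """Estimate one pixel's depth from the most frequent photon timing bin.

    bin_indices: integer timing-bin index of each detected photon.
    bin_width_s: width of one timing bin in seconds (assumed known)."""
    peak_bin, _count = Counter(bin_indices).most_common(1)[0]
    return depth_from_round_trip((peak_bin + 0.5) * bin_width_s)
```

With only a few photons per pixel, a per-pixel peak is unreliable, which is why reconstruction methods pool information across neighbouring pixels or time scales.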

  • Research Article
  • 10.3390/math14030421
Probabilistic Indoor 3D Object Detection from RGB-D via Gaussian Distribution Estimation
  • Jan 26, 2026
  • Mathematics
  • Hyeong-Geun Kim

Conventional object detectors represent each object by a deterministic bounding box, regressing its center and size from RGB images. However, such discrete parameterization ignores the inherent uncertainty in object appearance and geometric projection, which can be more naturally modeled as a probabilistic density field. Recent works have introduced Gaussian-based formulations that treat objects as distributions rather than boxes, yet they remain limited to 2D images or require late fusion between image and depth modalities. In this paper, we propose a unified Gaussian-based framework for direct 3D object detection from RGB-D inputs. Our method is built upon a vision transformer backbone to effectively capture global context. Instead of separately embedding RGB and depth features or refining depth within region proposals, our method takes a full four-channel RGB-D tensor and predicts the mean and covariance of a 3D Gaussian distribution for each object in a single forward pass. We extend a pretrained vision transformer to accept four-channel inputs by augmenting the patch embedding layer while preserving ImageNet-learned representations. This formulation allows the detector to represent both object location and geometric uncertainty in 3D space. By optimizing divergence metrics such as the Kullback–Leibler or Bhattacharyya distances between predicted and target distributions, the network learns a physically consistent probabilistic representation of objects. Experimental results on the SUN RGB-D benchmark demonstrate that our approach achieves competitive performance compared to state-of-the-art point-cloud-based methods while offering uncertainty-aware and geometrically interpretable 3D detections.
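The two divergences named above have closed forms for Gaussians. To keep the example dependency-free, the sketch below assumes diagonal covariances (per-axis variances), which is a simplification; the paper predicts a full mean and covariance.

```python
import math


def kl_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """KL(P || Q) for Gaussians with diagonal covariance.

    Per dimension: 0.5 * (ln(vq / vp) + (vp + (mp - mq)^2) / vq - 1)."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        total += math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0
    return 0.5 * total


def bhattacharyya_diag_gaussians(mu_p, var_p, mu_q, var_q):
    """Bhattacharyya distance for Gaussians with diagonal covariance.

    Per dimension, with v = (vp + vq) / 2:
    (mp - mq)^2 / (8 v) + 0.5 * ln(v / sqrt(vp * vq))."""
    total = 0.0
    for mp, vp, mq, vq in zip(mu_p, var_p, mu_q, var_q):
        v = 0.5 * (vp + vq)
        total += 0.125 * (mp - mq) ** 2 / v + 0.5 * math.log(v / math.sqrt(vp * vq))
    return total
```

KL is asymmetric in its arguments while the Bhattacharyya distance is symmetric, which affects how over- and under-estimated uncertainty is penalized during training.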

  • Research Article
  • 10.1016/j.neunet.2026.108579
DMDNet: Dual-branch multi-modal deep fusion network for V-D-T salient object detection.
  • Jan 1, 2026
  • Neural networks : the official journal of the International Neural Network Society
  • Yaoqi Sun + 3 more

  • Research Article
  • 10.4236/jilsa.2026.181002
A Multi-Modal Approach for Arabic Sign Language Gesture Recognition Using Deep Learning
  • Jan 1, 2026
  • Journal of Intelligent Learning Systems and Applications
  • Nouf Alharbi

This paper proposes a multi-modal deep learning framework for Arabic Sign Language (ArSL) recognition, addressing the challenges of both static and dynamic gesture recognition. The framework integrates spatial, temporal, and depth features using CNN, Transformer, and Depth-CNN models, combined via an attention-based fusion mechanism. A hierarchical recognition approach first classifies gestures as static or dynamic, then processes them with specialized models: MobileNetV3 for dynamic gestures and an MLP-KAN hybrid for static gestures. Evaluated on four ArSL datasets (Kaggle ASL, ArSL2018, DArSL50, KSU-ArSL), the system achieves 98.4% overall accuracy with real-time inference speeds of 0.007 seconds for static gestures and 0.02 seconds for dynamic gestures. Ablation studies confirm the importance of multi-modal fusion, with attention-based fusion improving accuracy by 11% compared to simple concatenation. The system demonstrates strong generalization across diverse datasets and conditions, making it suitable for real-world deployment in assistive communication technologies.

  • Research Article
  • 10.2298/csis251109012y
FNNMFF: Crop pests and diseases detection based on fuzzy neural network and multilevel feature fusion in remote sensing images
  • Jan 1, 2026
  • Computer Science and Information Systems
  • Shoulin Yin + 4 more

To address the poor detection of crop pests and diseases caused by complicated image backgrounds and interference from irrelevant factors, this paper proposes a novel crop pest and disease detection method for remote sensing images based on a fuzzy neural network and multilevel feature fusion. First, the model, which is based on YOLOv5, extracts semantic information from features at different depths of the convolutional neural network and uses a weight aggregation module to adaptively learn the weight of each layer's features. The learned weights are then applied to the segmentation maps obtained by sampling on each feature layer to produce the final segmentation results. A fuzzy learning module is added to the skip connections to remove noisy features and alleviate uncertainty between classes. The traditional cross-entropy loss activates the output with the Softmax function and computes a weighted cross-entropy loss against the label. If the weight of the cross-entropy term is not adjusted, the model tends to update the weights related to the background, which makes it difficult to handle the class imbalance in remote sensing images. Therefore, we use focal loss to alleviate the class imbalance. Results on public datasets show that the proposed model achieves an accuracy above 95%, a recall above 85%, and an average accuracy of 91.2%. In terms of F1, the presented method improves on other advanced methods by 6.6%, 8.5%, and 7.7%, respectively. This shows that the new model is robust and generalizes well for crop pest detection.
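The contrast the abstract draws between plain cross-entropy and the loss used against class imbalance can be sketched as follows, assuming the loss in question is the standard focal loss, which scales cross-entropy by (1 - p_t)^gamma so that well-classified examples contribute less. The gamma and alpha values below are illustrative defaults, not values reported in the abstract, and the function names are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, target):
    """Standard cross-entropy for a single example with integer class label."""
    return -math.log(softmax(logits)[target])


def focal_loss(logits, target, gamma=2.0, alpha=1.0):
    """Focal loss: cross-entropy scaled by alpha * (1 - p_t) ** gamma.

    gamma=0 and alpha=1 recover plain cross-entropy.
    """
    p_t = softmax(logits)[target]
    return -alpha * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy = [4.0, 0.0]  # confidently correct, e.g. an obvious background pixel
    hard = [0.0, 1.0]  # misclassified example whose true class is 0
    for name, logits in (("easy", easy), ("hard", hard)):
        print(name, "CE:", cross_entropy(logits, 0),
              "focal:", focal_loss(logits, 0))
```

In this sketch the easy example is down-weighted far more strongly than the hard one, which is the mechanism that keeps an abundant background class from dominating the gradient.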

