Ovarian Ultrasound Image Segmentation Algorithm with Fused Multi-Scale Features
Ultrasound imaging plays a vital role in medical diagnosis. Ovarian ultrasound image segmentation is challenging: lesion sizes vary widely with the stage at which the cancer is detected and with individual differences, and reflected-wave interference introduces noise. To address these challenges, we propose a novel ovarian ultrasound image segmentation algorithm that fuses multi-scale features, enabling the model to process image content at varying scales effectively. A skip-connection structure preserves shallow image features, and a feature fusion module integrates the feature maps extracted from the backbone network layer by layer, strengthening the model's ability to parse multi-scale features. The proposed algorithm was evaluated on ovarian ultrasound images denoised with different filtering methods. Compared with mainstream segmentation algorithms, our model improved mIoU, mAcc, and aAcc by 2.02%, 1.09%, and 0.34%, respectively. Overall, the algorithm outperformed the comparison methods, offering a new solution for ovarian ultrasound image segmentation.
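The abstract describes the architecture only at a high level, but its two key ideas — skip connections that carry shallow features forward, and a fusion module that merges backbone feature maps layer by layer — can be illustrated with a short sketch. The following PyTorch code is a minimal, hypothetical rendering of that design; the module names, channel widths, number of stages, 1x1 projection convolutions, and choice of bilinear upsampling are all assumptions, not details taken from the paper.

```python
# Minimal sketch of a layer-by-layer multi-scale fusion segmenter.
# Assumed details: 4-stage backbone, channel widths, bilinear upsampling.
import torch
import torch.nn as nn
import torch.nn.functional as F

def conv_block(in_ch, out_ch):
    # Two 3x3 convs with BN/ReLU, a common encoder building block.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.BatchNorm2d(out_ch), nn.ReLU(inplace=True),
    )

class MultiScaleFusionNet(nn.Module):
    def __init__(self, num_classes=2, widths=(32, 64, 128, 256)):
        super().__init__()
        self.encoders = nn.ModuleList()
        in_ch = 1  # single-channel ultrasound input
        for w in widths:
            self.encoders.append(conv_block(in_ch, w))
            in_ch = w
        # 1x1 convs project every backbone stage to a common width
        # so the fusion module can merge the stages layer by layer.
        fuse_w = widths[0]
        self.projects = nn.ModuleList(nn.Conv2d(w, fuse_w, 1) for w in widths)
        self.fuse = nn.ModuleList(conv_block(2 * fuse_w, fuse_w) for _ in widths[:-1])
        self.head = nn.Conv2d(fuse_w, num_classes, 1)

    def forward(self, x):
        feats = []
        for i, enc in enumerate(self.encoders):
            if i > 0:
                x = F.max_pool2d(x, 2)  # downsample between backbone stages
            x = enc(x)
            feats.append(x)
        # Fuse deep-to-shallow: upsample the deeper map and concatenate it
        # with the skip connection from the next shallower stage.
        y = self.projects[-1](feats[-1])
        for i in range(len(feats) - 2, -1, -1):
            y = F.interpolate(y, size=feats[i].shape[-2:], mode="bilinear", align_corners=False)
            y = self.fuse[i](torch.cat([self.projects[i](feats[i]), y], dim=1))
        return self.head(y)  # per-pixel class logits at input resolution

if __name__ == "__main__":
    model = MultiScaleFusionNet()
    out = model(torch.randn(1, 1, 128, 128))
    print(out.shape)  # torch.Size([1, 2, 128, 128])
```

In this sketch the decoder fuses features from deepest to shallowest, so the shallow skip connection is applied last and the finest spatial detail is restored just before the classification head — one plausible reading of "integrated layer by layer".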
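The reported metrics follow the standard semantic segmentation definitions: aAcc is overall pixel accuracy, mAcc is the mean of per-class accuracies, and mIoU is the mean of per-class intersection over union. As a reference for how such numbers are computed, here is a small NumPy sketch; the function name and the assumption that every class appears in the ground truth (avoiding division by zero) are ours, not the paper's.

```python
import numpy as np

def seg_metrics(pred, gt, num_classes):
    """Compute mIoU, mAcc, and aAcc from integer label maps."""
    # Confusion matrix: rows index ground-truth classes, columns predictions.
    cm = np.bincount(gt.ravel() * num_classes + pred.ravel(),
                     minlength=num_classes ** 2).reshape(num_classes, num_classes)
    tp = np.diag(cm).astype(float)
    aacc = tp.sum() / cm.sum()                                   # overall pixel accuracy
    macc = (tp / cm.sum(axis=1)).mean()                          # mean per-class accuracy
    miou = (tp / (cm.sum(axis=1) + cm.sum(axis=0) - tp)).mean()  # mean IoU
    return miou, macc, aacc

# Toy example: 2-class prediction vs. ground truth.
gt = np.array([[0, 0, 1, 1]])
pred = np.array([[0, 1, 1, 1]])
print(seg_metrics(pred, gt, num_classes=2))  # (0.5833..., 0.75, 0.75)
```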