Sample Imbalance Research Articles

Identification of interactions between chemical compounds and proteins is crucial for various applications, including drug discovery, target identification, network pharmacology, and elucidation of protein functions. Deep neural network-based approaches are becoming increasingly popular for efficiently identifying compound-protein interactions at high throughput, narrowing down the candidates for traditional experimental techniques, which are labor-intensive, time-consuming, and expensive. In this study, we propose an end-to-end approach termed SPVec-SGCN-CPI, which uses a simplified graph convolutional network (SGCN) model with low-dimensional, continuous features generated by our previously developed model SPVec, together with graph topology information, to predict compound-protein interactions. The SGCN technique separates local neighborhood aggregation from layer-wise nonlinear propagation, so it aggregates K-order neighbor information effectively while avoiding neighbor explosion and speeding up training. The performance of SPVec-SGCN-CPI was assessed on three datasets and compared against four machine learning- and deep learning-based methods, as well as six state-of-the-art methods. Experimental results showed that SPVec-SGCN-CPI outperformed all of these competing methods, particularly in unbalanced data scenarios. By propagating node features and topological information to the feature space, SPVec-SGCN-CPI effectively incorporates interactions between compounds and proteins, enabling the fusion of heterogeneous information. Furthermore, our method scored all unlabeled data in ChEMBL, and the top five ranked compound-protein interactions were confirmed through molecular docking and existing evidence. These findings suggest that our model can reliably uncover compound-protein interactions within unlabeled compound-protein pairs, carrying substantial implications for drug re-profiling and discovery. In summary, SPVec-SGCN demonstrates its efficacy in accurately predicting compound-protein interactions, showing potential to enhance target identification and streamline drug discovery processes.
Scientific contributions: The methodology presented in this work not only enables comparatively accurate prediction of compound-protein interactions but also, for the first time, takes into consideration both sample imbalance, which is very common in real-world data, and computational efficiency, accelerating the target identification and drug discovery process.
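The separation of neighborhood aggregation from nonlinear propagation described in this abstract can be illustrated with a minimal sketch of simplified graph convolution in general, where node features are multiplied K times by a normalized adjacency matrix before any classifier is trained. This is not the authors' implementation: the symmetric normalization with self-loops, the toy graph, and the function names `normalized_adjacency`, `matmul`, and `propagate` are all assumptions made for illustration.

```python
# Hypothetical sketch of SGC-style feature propagation (not the authors' code).
from math import sqrt


def normalized_adjacency(edges, n):
    """Return S = D^{-1/2} (A + I) D^{-1/2} as a dense list of lists."""
    a = [[0.0] * n for _ in range(n)]
    for i in range(n):
        a[i][i] = 1.0  # add self-loops so each node keeps its own features
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0  # undirected edge
    deg = [sum(row) for row in a]
    return [[a[i][j] / sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(m, x):
    """Plain dense matrix product m @ x."""
    return [
        [sum(m[i][k] * x[k][j] for k in range(len(x))) for j in range(len(x[0]))]
        for i in range(len(m))
    ]


def propagate(edges, features, k):
    """Compute S^K X: K rounds of linear neighborhood averaging, no nonlinearity."""
    s = normalized_adjacency(edges, len(features))
    x = features
    for _ in range(k):
        x = matmul(s, x)
    return x


# Toy path graph 0 - 1 - 2 (assumed data): only node 0 carries a nonzero feature.
edges = [(0, 1), (1, 2)]
features = [[1.0], [0.0], [0.0]]
one_hop = propagate(edges, features, 1)  # node 2 has not yet seen node 0
two_hop = propagate(edges, features, 2)  # node 2 now receives 2nd-order information
```

Because the propagated features contain no trainable weights, they can be computed once before training, and a simple linear classifier can then be fitted on them; this is the usual reason given for the speed-up of this family of models.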

Background and Objective: The classification of diabetic retinopathy (DR) aims to use the implicit information in images for early diagnosis, in order to prevent and mitigate further worsening of the condition. However, existing methods often need large annotated datasets to show significant advantages. They also require the samples in the dataset to be evenly distributed across categories, because an imbalanced sample distribution can lead to an excessive focus on high-frequency disease categories while neglecting less common but equally important ones. Therefore, there is an urgent need for a new classification method that can effectively alleviate sample distribution imbalance and thereby improve the accuracy of diabetic retinopathy classification.
Methods: In this work, we propose MediDRNet, a dual-branch network model based on prototypical contrastive learning. The model creates prototypes for the different lesion levels, ensuring that they represent the core features of each level, and classifies by comparing the similarity between data points and their category prototypes. Our dual-branch network structure effectively addresses category imbalance and improves classification accuracy by emphasizing subtle differences in retinal lesions. Moreover, our approach incorporates the convolutional block attention module for enhanced lesion feature identification.
Results: Our experiments on the Kaggle and UWF classification datasets demonstrate that MediDRNet performs exceptionally well compared with other advanced models, especially on the UWF DR classification dataset, where it achieved state-of-the-art performance across all metrics. On the Kaggle DR classification dataset, it achieved the highest average classification accuracy (0.6327) and Macro-F1 score (0.6361). In particular, for the minority categories of diabetic retinopathy on the Kaggle dataset (Grades 1, 2, 3, and 4), the model reached classification accuracies of 58.08%, 55.32%, 69.73%, and 90.21%, respectively. In the ablation study, MediDRNet proved more effective at extracting features from diabetic retinal fundus images than other feature extraction methods.
Conclusions: This study employed prototype contrastive learning and bidirectional branch learning strategies to construct a grading system for diabetic retinopathy lesions on imbalanced diabetic retinopathy datasets. Through the dual-branch network, the feature learning branch facilitated a smooth transition of features from the grading network to the classification learning branch, accurately identifying minority sample categories. The method not only effectively addressed sample imbalance but also provided strong support for the precise grading and early diagnosis of diabetic retinopathy in clinical applications, showing exceptional performance on complex diabetic retinopathy datasets. Moreover, this research significantly improved the efficiency of prevention and management of disease progression in diabetic retinopathy patients within medical practice. We encourage the use and modification of our code, which is publicly accessible on GitHub: https://github.com/ReinforceLove/MediDRNet.
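The idea of classifying by similarity to category prototypes, as described in the Methods of this abstract, can be sketched in a few lines. This is only an illustration under stated assumptions, not MediDRNet itself: here each prototype is simply the mean embedding of a class and similarity is cosine similarity, whereas the paper learns its prototypes with a contrastive objective. The embeddings, labels, and function names below are hypothetical.

```python
# Hypothetical nearest-prototype classifier (illustration only, not MediDRNet).
from math import sqrt


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[d] for v in vectors) / n for d in range(len(vectors[0]))]


def cosine(a, b):
    """Cosine similarity between two nonzero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))


def build_prototypes(embeddings, labels):
    """One prototype per class: the mean embedding (an assumed simplification)."""
    by_class = {}
    for vec, label in zip(embeddings, labels):
        by_class.setdefault(label, []).append(vec)
    return {label: mean_vector(vecs) for label, vecs in by_class.items()}


def classify(vec, prototypes):
    """Return the label whose prototype is most similar to vec."""
    return max(prototypes, key=lambda label: cosine(vec, prototypes[label]))


# Toy imbalanced data (assumed): four Grade 0 samples, one Grade 4 sample.
embeddings = [[1.0, 0.1], [0.9, 0.0], [1.0, 0.2], [0.8, 0.1], [0.1, 1.0]]
labels = [0, 0, 0, 0, 4]
prototypes = build_prototypes(embeddings, labels)
minority_prediction = classify([0.2, 0.9], prototypes)
majority_prediction = classify([1.0, 0.0], prototypes)
```

Each class contributes exactly one prototype regardless of how many samples it has, which is one intuition for why prototype-based decisions can be less dominated by high-frequency categories than a decision rule fitted directly to raw class counts.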

Articles published on Sample Imbalance

FCAN: Speech emotion recognition network based on focused contrastive learning

FIQ: A Fastener Inspection and Quantization Method Based on Mask FRCN

An oversampling method based on Gaussian Mixture Model for multi-bolt looseness monitoring using Lamb waves

Semi-supervised soft sensor development based on dynamic dimensionality reduction-assisted large-scale pseudo label optimization and sample-weighted quality-relevant deep learning

DAJLENet: A neural network based on dual attention and joint learning for explainable heart failure adverse event prediction

Fault diagnosis of power equipment based on variational autoencoder and semi‐supervised learning

Lithology identification based on ramified structure model using generative adversarial network for imbalanced data

An Accurate Recognition Method for Landslides Based on a Semi-Supervised Generative Adversarial Network: A Case Study in Lanzhou City

Multifactorial Tomato Leaf Disease Detection Based on Improved YOLOV5

An end-to-end method for predicting compound-protein interactions based on simplified homogeneous graph convolutional network and pre-trained language model

SCV Filter: A Hybrid Deep Learning Model for SARS-CoV-2 Variants Classification

Tri-Flow-YOLO: Counter helps to improve cross-domain object detection

Road marking defect detection based on CFG_SI_YOLO network

A hybrid demultiplexing strategy that improves performance and robustness of cell hashing.

Not seeing the wood for the trees: Influences on random forest accuracy

MediDRNet: Tackling category imbalance in diabetic retinopathy classification with dual-branch learning and prototypical contrastive learning

An Improved Face Mask Detection Simulation Algorithm Based on YOLOv5 Model

MSGC-YOLO: An Improved Lightweight Traffic Sign Detection Model under Snow Conditions

A fault diagnosis framework based on heterogeneous ensemble learning for air conditioning chiller with unbalanced samples

Sample-imbalanced wafer map defects classification based on auxiliary classifier denoising diffusion probability model
